{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "EFyh_Kfx5jkw" }, "source": [ "**Chapter 15 – Processing Sequences Using RNNs and CNNs**\n", "\n", "**Chapter 16 – Natural Language Processing with RNNs and Attention**" ] }, { "cell_type": "markdown", "metadata": { "id": "TQs1xzgo5jkz" }, "source": [ "_This notebook contains the sample from https://github.com/ageron/handson-ml2/ and https://github.com/fchollet/deep-learning-with-python-notebooks_" ] }, { "cell_type": "markdown", "metadata": { "id": "qIV46eeo5jkz" }, "source": [ "\n", " \n", " \n", "
\n", " \"Open\n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "PdY3OIJ15jk0" }, "source": [ "# Setup" ] }, { "cell_type": "markdown", "metadata": { "id": "ykB4PPYw5jk0" }, "source": [ "First, let's import a few common modules, ensure MatplotLib plots figures inline and prepare a function to save the figures. We also check that Python 3.5 or later is installed (although Python 2.x may work, it is deprecated so we strongly recommend you use Python 3 instead), as well as Scikit-Learn ≥0.20 and TensorFlow ≥2.0." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "0cmPr4J_5jk1" }, "outputs": [], "source": [ "# Python ≥3.5 is required\n", "import sys\n", "assert sys.version_info >= (3, 5)\n", "\n", "# Is this notebook running on Colab or Kaggle?\n", "IS_COLAB = \"google.colab\" in sys.modules\n", "IS_KAGGLE = \"kaggle_secrets\" in sys.modules\n", "\n", "# Scikit-Learn ≥0.20 is required\n", "import sklearn\n", "assert sklearn.__version__ >= \"0.20\"\n", "\n", "# TensorFlow ≥2.0 is required\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "assert tf.__version__ >= \"2.0\"\n", "\n", "if not tf.config.list_physical_devices('GPU'):\n", " print(\"No GPU was detected. 
LSTMs and CNNs can be very slow without a GPU.\")\n", " if IS_COLAB:\n", "     print(\"Go to Runtime > Change runtime and select a GPU hardware accelerator.\")\n", " if IS_KAGGLE:\n", "     print(\"Go to Settings > Accelerator and select GPU.\")\n", "\n", "# Common imports\n", "import numpy as np\n", "import os\n", "from pathlib import Path\n", "\n", "# To make this notebook's output stable across runs\n", "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "# To plot pretty figures\n", "%matplotlib inline\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "mpl.rc('axes', labelsize=14)\n", "mpl.rc('xtick', labelsize=12)\n", "mpl.rc('ytick', labelsize=12)\n", "\n", "# Where to save the figures\n", "PROJECT_ROOT_DIR = \".\"\n", "CHAPTER_ID = \"rnn\"\n", "IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, \"images\", CHAPTER_ID)\n", "os.makedirs(IMAGES_PATH, exist_ok=True)\n", "\n", "def save_fig(fig_id, tight_layout=True, fig_extension=\"png\", resolution=300):\n", "    path = os.path.join(IMAGES_PATH, fig_id + \".\" + fig_extension)\n", "    print(\"Saving figure\", fig_id)\n", "    if tight_layout:\n", "        plt.tight_layout()\n", "    plt.savefig(path, format=fig_extension, dpi=resolution)" ] }, { "cell_type": "markdown", "metadata": { "id": "eAE6jniY5jk2" }, "source": [ "# Basic RNNs for forecasting time series" ] }, { "cell_type": "markdown", "metadata": { "id": "PhK1Eyvw5jk3" }, "source": [ "## Generate the Dataset" ] }, { "cell_type": "markdown", "source": [ "Suppose you are studying the number of active users per hour on your website, or the daily temperature in your city, or your company’s financial health, measured quarterly using multiple metrics. In all these cases, the data will be a sequence of one or more values per time step. This is called a time series. 
In the first two examples there is a single value per time step, so these are **univariate time series**, while in the financial example there are multiple values per time step (e.g., the company’s revenue, debt, and so on), so it is a **multivariate time series**. A typical task is to predict future values, which is called forecasting. Another common task is to fill in the blanks: to predict (or rather “postdict”) missing values from the past. This is called imputation.\n", "\n", "For simplicity, we are using a time series generated by the `generate_time_series()` function, shown here:" ], "metadata": { "id": "ZNbWWTJm9_WN" } }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "4XG7SRs35jk3" }, "outputs": [], "source": [ "def generate_time_series(batch_size, n_steps):\n", "    freq1, freq2, offsets1, offsets2 = np.random.rand(4, batch_size, 1)\n", "    time = np.linspace(0, 1, n_steps)\n", "    series = 0.5 * np.sin((time - offsets1) * (freq1 * 10 + 10))  # wave 1\n", "    series += 0.2 * np.sin((time - offsets2) * (freq2 * 20 + 20))  # + wave 2\n", "    series += 0.1 * (np.random.rand(batch_size, n_steps) - 0.5)  # + noise\n", "    return series[..., np.newaxis].astype(np.float32)" ] }, { "cell_type": "markdown", "source": [ "The function returns a NumPy array of shape `[batch size, time steps, 1]`, where each series is the sum of two sine waves of fixed amplitudes but random frequencies and phases, plus a bit of noise.\n", "\n", "When dealing with time series (and other types of sequences such as sentences), the input features are generally represented as 3D arrays of shape `[batch size, time steps, dimensionality]`, where dimensionality is 1 for univariate time series and more for multivariate\n", "time series.\n" ], "metadata": { "id": "3mT-BlwO-QMs" } }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "jGFqAitJ5jk4" }, "outputs": [], "source": [ "np.random.seed(42)\n", "\n", "n_steps = 50\n", "series = generate_time_series(10000, n_steps + 1)\n", "X_train, y_train = series[:7000, :n_steps], series[:7000, -1]\n", "X_valid, y_valid = series[7000:9000, :n_steps], series[7000:9000, -1]\n", "X_test, y_test = series[9000:, :n_steps], series[9000:, -1]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "UXAjbgOt5jk4", "outputId": "5de60e8b-2b38-47ac-e541-a42fcd6ba328", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "((7000, 50, 1), (7000, 1))" ] }, "metadata": {}, "execution_count": 5 } ], "source": [ "X_train.shape, y_train.shape" ] }, { "cell_type": "markdown", "source": [ "`X_train` contains 7,000 time series (i.e., its shape is `[7000, 50, 1]`), while `X_valid` contains 2,000 (from the 7,000th time series to the 8,999th) and `X_test` contains 1,000 (from the 9,000th to the 9,999th). **Since we want to\n", "forecast a single value for each series**, the targets are column vectors (e.g., `y_train` has a shape of `[7000, 1]`)." ], "metadata": { "id": "LXt8l_8F-2Ed" } }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "_LQXcP-r5jk4", "outputId": "27d1aa45-0690-437f-b77a-ae48446f7b66", "colab": { "base_uri": "https://localhost:8080/", "height": 314 } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Saving figure time_series_plot\n" ] }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "def plot_series(series, y=None, y_pred=None, x_label=\"$t$\", y_label=\"$x(t)$\", legend=True):\n", "    plt.plot(series, \".-\")\n", "    if y is not None:\n", "        plt.plot(n_steps, y, \"bo\", label=\"Target\")\n", "    if y_pred is not None:\n", "        plt.plot(n_steps, y_pred, \"rx\", markersize=10, label=\"Prediction\")\n", "    plt.grid(True)\n", "    if x_label:\n", "        plt.xlabel(x_label, fontsize=16)\n", "    if y_label:\n", "        plt.ylabel(y_label, fontsize=16, rotation=0)\n", "    plt.hlines(0, 0, 100, linewidth=1)\n", "    plt.axis([0, n_steps + 1, -1, 1])\n", "    # Show the legend only when a target or a prediction was plotted\n", "    if legend and (y is not None or y_pred is not None):\n", "        plt.legend(fontsize=14, loc=\"upper left\")\n", "\n", "fig, axes = plt.subplots(nrows=1, ncols=3, sharey=True, figsize=(12, 4))\n", "for col in range(3):\n", "    plt.sca(axes[col])\n", "    plot_series(X_valid[col, :, 0], y_valid[col, 0],\n", "                y_label=(\"$x(t)$\" if col == 0 else None),\n", "                legend=(col == 0))\n", "save_fig(\"time_series_plot\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "mVW_UPLp5jk5" }, "source": [ "## Computing Some Baselines" ] }, { "cell_type": "markdown", "metadata": { "id": "opK47Oz95jk5" }, "source": [ "Before we start using RNNs, it is often a good idea to have a few baseline metrics, or else we may end up thinking our model works great when in fact it is doing worse than basic models. The simplest approach is to predict the last value in each series. 
This is called *naive forecasting*. In this case, it gives a validation MSE of about 0.020:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "zQIzWhQQ5jk5", "outputId": "be199c4b-cff9-45d6-91bd-ba16b2191cd9", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.020211367" ] }, "metadata": {}, "execution_count": 7 } ], "source": [ "y_pred = X_valid[:, -1]\n", "np.mean(keras.losses.mean_squared_error(y_valid, y_pred))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "4y1kxSFw5jk6", "outputId": "0f0b9c0d-9df8-46ae-917e-4bb3610496cf", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_series(X_valid[0, :, 0], y_valid[0, 0], y_pred[0, 0])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "9Zi14gly5jk6" }, "source": [ "Another simple approach is to use a fully connected network. Since it expects a flat list of features for each input, we need to add a `Flatten` layer. Let’s just use a simple Linear Regression model so that each prediction will be a linear combination of the values in the time series:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "Tr1yTxkj5jk6", "outputId": "7368b75f-6b0c-4019-9113-96586a7c23f2", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 3s 5ms/step - loss: 0.1001 - val_loss: 0.0545\n", "Epoch 2/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0379 - val_loss: 0.0266\n", "Epoch 3/20\n", "219/219 [==============================] - 1s 4ms/step - loss: 0.0202 - val_loss: 0.0157\n", "Epoch 4/20\n", "219/219 [==============================] - 1s 4ms/step - loss: 0.0131 - val_loss: 0.0116\n", "Epoch 5/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0103 - val_loss: 0.0098\n", "Epoch 6/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0089 - val_loss: 0.0087\n", "Epoch 7/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0080 - val_loss: 0.0079\n", "Epoch 8/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0073 - val_loss: 0.0071\n", "Epoch 9/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0066 - val_loss: 0.0066\n", "Epoch 10/20\n", "219/219 [==============================] - 1s 4ms/step - loss: 0.0061 - val_loss: 0.0062\n", "Epoch 11/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0057 - val_loss: 0.0057\n", "Epoch 
12/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0054 - val_loss: 0.0055\n", "Epoch 13/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0052 - val_loss: 0.0052\n", "Epoch 14/20\n", "219/219 [==============================] - 1s 4ms/step - loss: 0.0049 - val_loss: 0.0049\n", "Epoch 15/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0048 - val_loss: 0.0048\n", "Epoch 16/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0046 - val_loss: 0.0048\n", "Epoch 17/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0045 - val_loss: 0.0045\n", "Epoch 18/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0044 - val_loss: 0.0044\n", "Epoch 19/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0043 - val_loss: 0.0043\n", "Epoch 20/20\n", "219/219 [==============================] - 1s 3ms/step - loss: 0.0042 - val_loss: 0.0042\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[50, 1]),\n", " keras.layers.Dense(1)\n", "])\n", "\n", "model.compile(loss=\"mse\", optimizer=\"adam\")\n", "history = model.fit(X_train, y_train, epochs=20,\n", " validation_data=(X_valid, y_valid))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "swohpJKX5jk6", "outputId": "df2883b3-43eb-432e-a2cf-5188ae484c6a", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 0s 2ms/step - loss: 0.0042\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "0.004168086219578981" ] }, "metadata": {}, "execution_count": 10 } ], "source": [ "model.evaluate(X_valid, y_valid)" ] }, { "cell_type": "markdown", "source": [ "If we compile this model using the MSE loss and the default Adam optimizer, then fit it on 
the training set for 20 epochs and evaluate it on the validation set, we get an MSE of about 0.004. That’s much better than the naive approach!" ], "metadata": { "id": "nDRQiwp7_7Zd" } }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "kckEzMwk5jk7", "outputId": "cbed5fa3-36cc-4f07-dc94-eb94a923594e", "colab": { "base_uri": "https://localhost:8080/", "height": 291 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "def plot_learning_curves(loss, val_loss):\n", " plt.plot(np.arange(len(loss)) + 0.5, loss, \"b.-\", label=\"Training loss\")\n", " plt.plot(np.arange(len(val_loss)) + 1, val_loss, \"r.-\", label=\"Validation loss\")\n", " plt.gca().xaxis.set_major_locator(mpl.ticker.MaxNLocator(integer=True))\n", " plt.axis([1, 20, 0, 0.05])\n", " plt.legend(fontsize=14)\n", " plt.xlabel(\"Epochs\")\n", " plt.ylabel(\"Loss\")\n", " plt.grid(True)\n", "\n", "plot_learning_curves(history.history[\"loss\"], history.history[\"val_loss\"])\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "id": "HhHs0TEs5jk7", "outputId": "8ec7db41-b8ce-4486-8629-5cf8ff537d14", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZwAAAEUCAYAAAAfooCMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdeVyVVf7A8c/3sgqoIAIuKKgI7hsuqJloaqvltFhp2zRl0zLNTNM201ROU1NTTb9mKVumpqmsxkpbJ7eEXHIDFRXBDUFFFlkE2QTuPb8/7oW5ICDLXeG8X6/npfc85zn33Id77/c+55znHFFKoWmapmn2ZnB2BTRN07SuQQccTdM0zSF0wNE0TdMcQgccTdM0zSF0wNE0TdMcQgccTdM0zSF0wNE0TdMcwiUDjog8ICJJInJORN67QN5fi0iuiJSKyLsi4mO1L1JEEkSkQkTSRWSO3SuvaZqmNcklAw5wCngWeLelTCJyKfA4cAkQAQwG/mCV5WNgNxAMPAF8JiIh9qiwpmma1jJx5ZkGRORZIFwpdUcz+z8CMpVSv7M8vgRYrpTqIyLRwD6gt1LqrGX/Jsv+NxzyAjRN07R6ns6uQAeNBL60epwChIlIsGVfRl2wsdo/sqmCRGQJsATA19c3duDAgfapsRsxmUwYDK56Eew4+jzoc1BHn4eWz8GhQ4cKlFLNtiK5e8AJAEqsHtf9v3sT++r292+qIKXUW8BbADExMergwYO2rakbSkxMJD4+3tnVcDp9HvQ5qKPPQ8vnQESyWjrW3UN1GdDD6nHd/882sa9u/1k0TdM0h3P3gJMKjLV6PBbIU0oVWvYNFpHujfanOrB+mqZpmoVLBhwR8RQRX8AD8BARXxFpqvnvfeBnIjJCRAKB3wPvASilDgF7gKctx/8EGAN87pAXoWmapjXgkgEHc+CoxDzk+RbL/38vIgNFpExEBgIopVYDLwIJwHEgC3jaqpybgIlAMfACcL1S6rTDXoWmaZpWzyUHDSillgJLm9kd0CjvK8ArzZSTCcTbrmaapmlae7lkwHF1paWl5OfnU1NT4+yq2FXPnj1JS0tzdjUcxt/fn/Dw
8C4/7FXT7EUHnDYqLS0lLy+P/v37061bN0TE2VWym7Nnz9K9e/cLZ+wETCYT2dnZFBQUEBoa6uzqaFqnpH/KtVF+fj79+/fHz8+vUwebrsZgMBAWFkZJSeNbtzRNsxUdcNqopqaGbt26Obsamh14eXlRW1vr7GpoWqelA0476Cubzkn/XTXNvnTA0TRN0xxCBxxN0zTNIXTA0TRN0xxCB5wuQERa3O644w6n1S0yMpKXX37Zac+vaZrj6IDjJMuXQ2QkGAzmf5cvt99z5eTk1G9vv/32eWl//etf21RedXW1PaqpaVonpwOOEyxfDkuWQFYWKGX+d8kS+wWdPn361G+BgYEN0srLy7ntttvo06cP/v7+TJgwgW+++abB8ZGRkSxdupQ777yTwMBAFi9eDMC7777LwIED8fPzY/78+bz++uvnjfT6+uuviY2NxdfXl0GDBvHEE0/UB6z4+HiysrJ45JFH6q+2NE3rvHTAcYInnoCKioZpFRXmdEcrKyvj8ssvZ926daSkpHDddddx7bXXkp6e3iDfK6+8wrBhw0hKSuJPf/oTW7du5a677uL+++9nz549XH311Tz99NMNjlmzZg2LFy/mgQceIDU1lXfffZfPPvuM3/3udwCsXLmS8PBwnnrqqfqrLU3TOjGllN4abdHR0ao5Bw4caHZfa4koZb62abiJdLjoC/r000+V+c/evClTpqg//vGPqrS0VCmlVEREhLrqqqsa5LnpppvUpZde2iDt7rvvblD2jBkz1DPPPNMgz6pVq5S/v78ymUz1Zb/00kvtfj221tTfNyEhwfEVcTH6HJjp89DyOQCSVAvfrfoKxwkGDmxbuj2Vl5fz6KOPMmLECIKCgggICCApKYnjx483yDdx4sQGj9PT05k8eXKDtClTpjR4nJyczHPPPUdAQED9tmjRIsrLy8nNzbXPC9I0zWXpyTud4LnnzH021s1qfn7mdEd7+OGHWb16NS+//DJDhw7Fz8+P22677byBAf7+/m0u22Qy8fTTT3PDDTecty8kJKTdddY0zT3pgOMElj53nngCjh83X9k899z/0h1p8+bN3HbbbVx33XUAVFVVcfToUaKjo1s8btiwYezcubNB2o4dOxo8njBhAunp6URFRTVbjre3N0ajsZ211zTNneiA4ySLFzsnwDQWHR3NqlWruOaaa/Dy8uIPf/gDVVVVFzzuwQcf5KKLLuKll15iwYIFbNy4kVWrVjXI89RTT3HVVVcRERHBwoUL8fT0ZP/+/ezYsYMXX3wRMI+A27RpE7fccgs+Pj707t3bLq9T0zTnc9k+HBHpJSKrRKRcRLJEZFEz+b6zLDtdt1WLyD6r/ZkiUmm1f63jXoXre+WVVwgNDWXGjBlcfvnlxMXFMWPGjAseN3XqVN5++23+9re/MWbMGL744gsee+wxfH196/NceumlfPvttyQkJDB58mQmT57MCy+8wECrzqpnnnmGEydOMGTIEN3MpmmdnCtf4bwGVANhwDjgWxFJUUqlWmdSSl1u/VhEEoENjcqar5Rab8e6uo3rr78e82ASs4iICNavb3hqHn74YcC8ABtAZmZmk2Xdeeed3HnnnfWPf/3rX5/XfDZv3jzmzZvXbH3i4uJISUlp02vQNM09uWTAERF/4DpglFKqDNgsIl8BtwKPt3BcJDADuMP+tdReeukl5s6dS0BAAOvXr+eNN97gT3/6k7OrpWmai3LJgANEA7VKqUNWaSnAzAscdxuwSSmV2Sh9uYgYgN3AI0op/ZPaBpKSknj55ZcpKSlh0KBBPP/88/zyl790drU0TXNRYt284ipEZAbwqVKqj1Xa3cBipVR8C8cdAZ5VSr1nlTYd2AUI8EvLNkwpdabRsUuAJQAhISGxK1asaPI5evbs2eKoq87EaDTi4eHh7Go41JEjR85bZrqsrIyAgIAGaYeLa0krNDEi2IOooM5/jpo6B12RPg8tn4NZs2YlK6UmNrkT173CKQN6NErr
AZxt7gARuQjoA3xmna6U2mL18HkRuR1zs9vXjfK9BbwFEBMTo+Lj45t8nrS0NLp3796qF+Huzp4922Veax1fX1/Gjx/fIC0xMRHr90NyVjHPr/kRk4L/ZhlZflccsRFBDq6pYzU+B12VPg8dOweuOkrtEOApIkOt0sYCqc3kB7gdWGnp82mJwny1o2ntsmrXSUyWhoFzNSa2ZRQ6t0Ka5iZcMuAopcqBlcAzIuJvaRa7Bvigqfwi0g1YCLzXKH2giEwXEW8R8RWRR4DewJYmitG0Vkk5+b/WWAUM79u1rgI1rb1cMuBY3Ad0A/KBj4F7lVKpIjJDRBpfxSwAzgAJjdK7A8uAYiAbuAy4XCmlf5Jq7bLlSAH7skv56fRI7rxoEB4Ca1PznF0tTXMLrtqHg1KqCHMgaZy+CQholPYx5qDUOG8qMMZeddS6FqUUL605SN+evjx22TB8vTwwAO9sOcYtcRGM6t/T2VXUNJfmylc4muZS1qfls+fEGX55yVB8vcwj0x6cM5Reft4s/SoVVxzxqWmuRAccR3rxRUho3OrXSEKCOZ+b+uyzzxqs3Pnee+91eBhpYmIiIkJBQUFHq9duJpPi5TUHGdTbn+tiw+vTe/h68cilMSRlFfNVyimn1U/T3IEOOI40aRIsXNh80ElIMO+fNMnmT33HHXfUL+Ps5eXF4MGDefjhhykvL7f5c1m78cYbycjIaHX+yMhIXn755QZp06ZNIycnh+DgYFtXr9W+3nuKg3ln+fXcaLw8Gn5sbpg4gFH9e/DCd+lUVNc6qYaa5vp0wHGkWbNgxYqmg05dsFmxwpzPDubMmUNOTg4ZGRk8++yzvP766/Xzplmrra21WfNQt27dCA0N7VAZ3t7e9OnTp8GVkyPVmhSvrDvEsD7duWp03/P2exiEpfNHklNSxRuJR51QQ01zDzrgOFpTQccBwQbAx8eHPn36MGDAABYtWsTixYv54osvWLp0KaNGjeK9995jyJAh+Pj4UF5eTklJCUuWLCE0NJTu3bszc+ZMkpKSGpT5/vvvExERgZ+fH1dddRV5eQ1HbDXVpPbf//6XKVOm0K1bN4KDg5k/fz5VVVXEx8eTlZXFI488Un81Bk03qa1cuZLRo0fj4+PDgAEDeO655xoEycjISJ599lnuueceevToQXh4OC+99FK7ztvm7FqyCit45NIYDIamg97EyF5cPbYfb27M4ERRRZN5NK2r0wHHGayDzlNPOSTYNKVbt27U1NQAcOzYMT766CM+/fRTUlJS8PHx4YYbbiA7O5tvvvmG3bt3c/HFFzN79mxycnIA2L59O3fccQdLlixhz549zJ8/n6eeeqrF51y9ejVXX301c+fOJTk5mYSEBGbOnInJZGLlypWEh4fz1FNPkZOTU/88jSUnJ3PDDTdw7bXXsm/fPl544QWef/55/vGPfzTI93//93+MHj2aXbt28dhjj/Hoo4+ydevWNp2jqhojXx6pYcLAQGYPa/lK7bdXDMMgwqOfpfBawhGSs4rb9Fya1ukppfTWaIuOjlbNOXDgQLP72uzJJ5UC8792dvvtt6srr7yy/vH27dtVcHCwWrhwoXr66aeVp6enys3Nrd///fffK39/f1VRUdGgnLFjx6o///nPSimlbr75ZjVnzpwG+3/2s58p89vK7F//+pfy9/evfzxt2jR14403NlvPiIgI9dJLLzVIS0hIUIA6ffq0UkqpRYsWqVmzZjXI8/TTT6v+/fs3KOemm25qkCcqKkr98Y9/bPa5lTr/7/v0l/tUxGPfqPe2HGvxuDqPf5aiIh77RkU+9o2K+f1/VVJmUauOc3UJCQnOroJL0Oeh5XMAJKkWvlv1FY6zJCTAsmXw5JPmfy80es0GVq9eTUBAAL6+vkydOpWLL76Yv//97wCEh4cTFhZWnzc5OZmKigpCQkIICAio3/bv38/Ro+Z+irS0NKZOndrgORo/bmz37t1ccsklHXodaWlpTJ8+vUHaRRdd
RHZ2NqWlpfVpY8Y0vAWrX79+5Ofnt/p5Nh8p4L0fswB4/ru0Vl2x9OlpXoBOUTftjfNG1mmaq3HZGz87tcZ9NrNmOaRZ7eKLL+att97Cy8uLfv364eXlVb/P39+/QV6TyURoaCibN28+r5wePRrPq+o6rAcWWL++un0mk6nVZX28/Xj9/2tqzXOmXWiSzouGhrAs8ShVtSYU5ok+q2tNeHvq33aapj8FjtbUAIGWRq/ZkJ+fH1FRUURERJz3ZdzYhAkTyM/Px2AwEBUV1WCrG3U2fPhwtm3b1uC4xo8bGz9+PN9//32z+729vTEajS2WMXz4cLZsaTgd3ubNmwkPD7fp7NYlleb+LQPg5WkgbvCFh2XHRgSx/O44Hrk0mhsmhrMh/TS3vbudMxXVNquXptlCcmYRf99w2KF9jfoKx5FaGo1mHXScMICgsTlz5hAXF8c111zDiy++yLBhw8jNzWX16tXMmTOHGTNm8OCDDzJt2jSef/55rr/+ehITE1m1alWL5T7xxBPMnz+fqKgoFi1ahFKKtWvXcs899+Dn50dkZCSbNm3illtuwcfHh969e59Xxm9+8xsmTZrE0qVLWbRoETt37uQvf/mLTVcbrTWa2JddQnx0CL3VGW6eM6nVSxDERgTV550+pDePfraXn7z+I+/cPpHBIV17LRXN8ZKzitmWUUjc4GD6Bfqy5UghX6dk88Mhc3Pvq3KY/7tpLFeP7W/3uugrHEfaubPlYFIXdHbudGy9miAifPbZZ8yePZu7776bmJgYFi5cyMGDB+nXrx8AcXFxvPPOOyxbtowxY8awcuVKli5d2mK5V1xxBatWreK7775j/PjxzJw5k4SEBAwG81vxmWee4cSJEwwZMoSQkJAmy5gwYQKffvopn3/+OaNGjeLxxx/n8ccf54EHHrDZ69994gwllTXcMHEAVw3xbvd6NwvG9+eju6dQUlnD/L9v5rHP9+rRa5rDJGcVc/Nb23hpzUGuX/YjU5/fwMOfprDj2P/eg0alePDjPdzzQRK7j9v3vemSK346W0xMjDp48GCT+9LS0hg+fLiDa+QcXXEBtrq/759Xp/P2xgx2PTWXXdu2dHjRrf/uzeH+j3ahAF8vg9st2qYXHjNzt/Pw8pqD/CPhSP3jWTEhPHrZMMrP1XLLO9upqTXh6WFg/ti+rDuQT0llDVMG9WLOiDCqa43EDe593vu0pXMgIm654qemOVVCej4TI4Po4dtyX1drHSv83xRC1a0cgKBpHVU3hsYg4O1p4IHZQxne1zzoZ/ldcfVNbbERQZSdq+WTHcd5PeEo248VAXD/zucJuGsBMTdf3fyTJCSYW2UeffSC9dFNaprWSPaZStJzz17wRs+2iBscXD9SzSDSqgEImtZRR/LLCPLz4qG50eddVcdGBHH/rKj6tAAfT+6aMZg7pkfWL4m8J2woA++9w2bzP+qAo2mNJKSb79WxZcCJjQjio7um4OftwfSoYH11o9ldVY2RHw6d5vLRfXlg9tBWv+emR/XGx8uAh0DykHEcX/aezeZ/1E1qmtZIQno+A3p1Y4iNR5TFRvZi2pBgMgrsO0O3pgFszSikotrI3BFhF85sJTYiqEFzW0xEEPSxGkEr0u75H3XAaQellNNmLtbsxzz9Bmw5WsCNEwfY5W88fmAQ69PyKamooaefbfqHNK0p6w/k4eftwdR2NN9aD+0HGty2EXn55fDdd+26fcNlm9REpJeIrBKRchHJEpFFzeRbKiI1IlJmtQ222j9ORJJFpMLy77iO1MvLy4vKysqOFKG5qJqaGqqMiqoaE/E2bE6zNn5AIAB7Tp6xS/maBuYfT+vT8rh4aEj96rQdNmsW3HsvkR98APfe2657BV024ACvAdVAGLAYWCYiI5vJ+x+lVIDVlgEgIt7Al8CHQBDwb+BLS3q7hIaGkp2dTUVFhV5SuBMxmUzk5eWxL78GXy9Du34Vtsbo8J6IYPf7HbSubV92CXml59rcnNYi
y/yPmbfe2u75H12ySU1E/IHrgFFKqTJgs4h8BdwKPN6GouIxv8ZXLTOZ/k1EHgZmA6vbU7e6ecROnTpVP7V/Z1VVVYWvr6+zq+Ew/v7+/GtXIdOH9Lbdr8JGuvt6ER3and3H9RWOZj/rD+RhEJhlqyt1qz6bTBEif/rTTtWHEw3UKqUOWaWlADObyT9fRIqAHOAfSqlllvSRwF7V8FJkryW9QcARkSXAEoCQkBASExM7/CLcXVlZ2XmLp3Vmp8pMZBZWcnGYscHfv6yszKbvhzCvcyQdO0tCQoLb9AXa+hy4K3c5D6t2VhIVaGDvzh87XFbg7t2M+MMfOPD005wRMZ+DgAACf/c7RvzkJ+b08eNbV1hLaxc4awNmALmN0u4GEpvIOwLoB3gA0zAHnZst+54EPmmUfzmwtKXnb2k9nK6kq6398eYPR1TEY9+ok8UN1wCy9Xn4eHuWinjsG3U0/6xNy7WnrvZeaI47nIcTReUq4rFv1Js/HOl4YRs2KNW7t/lfiwbnoNF+3HQ9nDKg8Rz4PYCzjTMqpQ4opU4ppYxKqR+BvwLXt7UcTduQns+wPt3pH9jNrs8zbqBl4MAJ3aym2d76A+Zl3ueO6NPxwmw8/6OrBpxDgKeIDLVKGwuktuJYBfU3yqYCY6Rhu8WYVpajdSGlVTUkZRbbrs27BUNDu+Pv7aH7cTS7WJ+Wz5AQfwb19r9w5gt59NEL99HMmtWqaW3ARQOOUqocWAk8IyL+IjIduAb4oHFeEblGRILEbDLwIOaRaQCJgBF4UER8RKRuOuENdn8RmlvZdKiAWpOy6ewCzfEwCGMHBLL7hB6pptlWaVUN2zIKmWPL0Wk25JIBx+I+oBuQD3wM3KuUShWRGSJSZpXvJuAI5may94E/K6X+DaCUqgYWALcBZ4A7gQWWdE2rtyE9n57dvOrvk7G3cQMCSc85S2V1y4vNaVpb/HDwNLUmxdzhrhlwXHWUGkqpIszBonH6JiDA6vHNFyhnNxBr8wpqnYbJpPjhUD4zo0Pw9HDMb7DxA4OoNSn2nyphUmQvhzyn1vmtO5BHsL834we65lx9rnyFo2kOsS+7hIKyamYNa3rBN3sYVzfjgO7H0Wykxmgi4WA+s4eF4mFwzeH2OuBoXd7y7VkABPq1ewKKNgvp7kN4UDfdj6PZzM5jRZytqrVb/83y5RAZCbNnzyQy0vy4rXTA0bq05KxiPk06CcC9HyY7dPnn8QOD9BWOZjPr0vLw8TQwY2hvm5e9fDksWQJZWaCUkJVlftzWoKMDjtalfbUnm7ppKGosK3E6yrgBgZwqqSK3pMphz6l1Tkop1h3I46Ko3vh5275r/oknoKKiYVpFhTm9LXTA0bq002XnAPAQ8PI0OHQlzvH1N4DqZjWtY1btzuZkcSVDw+wzFdXx421Lb44OOFqXVVVjZMuRQqZH9eaheTHnLcFrbyP69sDLQ9itZxzQOiA5q5hHP9sLwL+2ZNqlWXjgwLalN0cHHK3LWncgj5LKGn4+c3CDtd0dxdfLgxH9euoZB7QO2ZZRSK3J3DBca7RPs/Bzz4GfX8M0Pz9zelvogKN1WSuSTtA/sBvThti+k7W1xg8IZN/JEmqNJqfVQXNvkweZ7+MS7NcsvHgxvPUWRESAiCIiwvx48eK2laMDjtYlZZ+pZPORAq6LDXfqPQvjBwZSWWPkYJ6eT1ZrHz9v89pNV4zua9dm4cWLITMTNmz4gczMtgcb0AFH66I+Tz6JUnBDbLhT6zF+gPnLQTerae1V12fz+OXDHN4s3FY64GhdjsmkWJF0gulRwQzo5XfhA+xoQK9u9PL31ksVaO2WlFlMWA/zjcSuTgccrcvZllHIyeJKFk4c4OyqICKMHxDI7uN6aLTWPslZxUyM6OUWq8fqgKN1OSuSTtDd15NLR9pggSobGDcgkKOnyymprHF2VTQ3k1NSSfaZSpdvSqujA47WpZRU1vDd/lyuGdcPXy8PZ1cHoH5m32e/OeDQ
qXU095eUaX6/TIzUAUfTXM5XKac4V2vixoltvGPNjpRlcp3Pkk+y+J/bdNDRWi05q5huXh4M79vD2VVpFR1wtC7l06QTDOvTnVH9XecDuvdkCWBeG93R87lp7i0pq4hxAwLxctA6Th3lHrXUNBtIyyll78kSFk4c4FIdrHGDg/G03Avk6eHY+dw091V2rpYDp0qZ5CbNaeDCAUdEeonIKhEpF5EsEVnUTL5HRGS/iJwVkWMi8kij/ZkiUikiZZZtrWNegeZqViSdwMtDWDC+v7Or0kBsRBCvLBwLwK1TI9ymA1hzrj3Hz2BSEOtGK8a6bMABXgOqgTBgMbBMREY2kU+A24Ag4DLgARG5qVGe+UqpAMs2z56V1mwnOauY1xKO2KRPY1tGAR/vOM6kiF708nfcQmutdfW4/gwO8Sc9R884oLVOUlYRIv+bddwduGTAERF/4DrgSaVUmVJqM/AVcGvjvEqpF5VSu5RStUqpg8CXwHTH1lizte0Zhdz45lb+svZghzvSv9uXw6K3t1NVYyIpq8hlO+XnjejDtoxCSir08GjtwpKziokJ604PXy9nV6XVRCl14VwOJiLjgS1KKT+rtIeBmUqp+S0cJ8Au4E2l1BuWtEygG+bguht4RCmV0sSxS4AlACEhIbErVqyw3QtyU2VlZQQE2GZ9jSPFRtKLjAzr5UFUUMvDkWtNiie3VJJTbn5vGoBrh3px1ZC2XZmUVSu+PlrNuqxa6qbGbE9ZtjwPLTlyxsiz26pYMsaHaf1sv4hWRzjqHLg6VzkPJqW4b30F0/p5cttIH4c+d0vnYNasWclKqYnNHeta7+r/CQBKG6WVAN0vcNxSzN8p/7JKW4w5CAnwS2CNiAxTSjWYS0Qp9RbwFkBMTIyKj49vb907jcTERGxxHnYcK+SFtdsxKYW3p7HFCQZrjCbuX76LnPIKPAyC0aQQEW6eM6lVfRvJWcVsPnya/LPn+CrlFOXnaokfFsKWI4XUGk14eRpaXVYdW52HC7nYpHgz9XtOqiDi42Pt/nxt4ahz4Opc5Tykniqhas1mrp42ingH90l25By4asApAxqPW+0BNNvALSIPYO7LmaGUOleXrpTaYpXteRG5HZgBfG276moteWfzsfr1OqprTWzLKGjyC7/GaOIXH+1m7YE8ls4fwejwQF5ak862jCJOteJu6uSsYm56ays1RvNzTYwI4k/XjiY6rDvJWcVsyygkbnCwy3bKGwzC3BFhfLE7m6oao8vcmKq5nrpmYVd9LzfHJftwgEOAp4gMtUobC6Q2lVlE7gQeBy5RSp28QNkK89WO5gBGk2LvyZL6E25SsOVIIWcqqhvkqzGaePDj3axOzeWpq0Zwx/RBxEYE8cHPpjBhYCC/W7mP44UV5z9BXbkmxZ9Xp9UHG4PArGGhRIeZL4pjI4KcsshaW80bEUZFtZEfjxY4uyqaC3OnCTutuWTAUUqVAyuBZ0TEX0SmA9cAHzTOKyKLgT8Bc5VSGY32DRSR6SLiLSK+liHTvYEtjcvR7GPdgVxySqr49dxoHp4XzaIpA9lxrIhLX93ID4dOA+ZVCn/1yR6+25/L768czp0XDao/3svDwF9vGg8Cv/hkNzVNLFRWXWvioRV72HGsGA+D4CHgbaeFqOxt6pBgAnw8WZua5+yqaC7MnSbstOaqTWoA9wHvAvlAIXCvUipVRGYA3yml6nqtngWCgZ1WJ/9DpdTPMff5LAOGAFXAHuBypZS+ldsBlFIs+yGDgb38uH9WVP1CZ4smD+TX/9nD7e/u4PJRfThWUE567ll+f+Vw7pox+LxyBvTy48/XjeG+5bt4ee1Bfnv58Pp9Z6tq+PmHyWw5Usgjl8YQN6gX244VuXTTWUt8PD2IjwlhfVoeRpNy6uJwmms6dcY8YeddMwZdOLOLcdmAo5QqAhY0kb4J86CCusfNnnWlVCowxi4V1C5o+7EiUk6c4Y8LRjX44hzVvydf/+IiHv40hW/25gDgaZD6
SSybcgI6FtsAACAASURBVMXovtw8eSBv/pDB9CG9uTg6hLzSKm5/dwdH8sv4yw1juc6ymJo73QjXlLkjwvhmbw57ThQTG+Her0WzvSRL/81EN3xvuGSTmtY5vPHDUXoHeDe5qqavZcLBujiklLrgHGJPXTWC6LAAfvHxLh77PIUr/7aJE0UVvHvHpPpg0xnMGhaKl4foZjWtScmZRfh5ezC874UG7boeHXA0u0jLKSXx4GnumBbZ7GiruMHBeHsa8BDwakWfSzdvD+6Lj6Kkspb/7DxJQVk1S68eycXRIfZ4CU7Tw9eLuMHBrEnNxRXvk9OcKymrmHEDAvF0kwk7rblfjTW38NbGDPy8PbglLqLZPLERQSy/K46H5sW0eG+OtewzlfUj3gwC+WfPtZjfXc0b2YfMwgqO5Jc5uyqaCyk7V0taTikT3bB/EnTA0ezgZHEFX6Wc4ubJAwn0a/mO/rYOV44bHIyPl8GtR6K1xtzhYQCsPaCb1bT/cccJO6257KABzX29s/kYAvzsItuPoqm7KnL1mzg7qk9PX8YOCGRtai73z4pydnU0F+GOE3Zas/kVjoj8TUS+aSK9h4gsFZHhVmm/EpF9IqKvtDqJ4vJqPtlxgqvH9aNfoH1uSnOXmzg7at6IMFJOlpBbUuXsqmguIiE9n97+3hzOc8+mVpt+0YvIEODnmOc0a2wi8DRgPbXpm0AIcLst66E5zwfbsqisMXLPxUOcXRW3d+lIc7PaujTdrKbBzswiUk6WcLqs2m2XIrf1lcWvgBSlVFIT+8YD54ADdQlKqUrgfeBhG9dDc4KtRwtYlniU2IggYvq435BNVzMkJIB+PX15Z1OGW365aLb1xe7s+v+761LkrQo4IhIlIjUi8kyj9GWWlTYniogPcAvwURPHpwEvAz5AjYgoEfncsvsTYISITOvQK9GcKjmrmFvf2UFljZF9J0v0F6QN7Dp+hryz58gsrGDR2+75i1aznYpqI0CrbyNwRa0KOEqpI8A/gV+JSDCAiDwF3An8xHJFEwcEApuaKOI2IAPzDM1TLdtDln17MM8CfVn7X4bmbNsyCutnhDaa3PPXl6vZllFYfx/OuVoTa1NznVwjzZn2ZZcwun+PNt1G4Gra0qT2DOABPC4id2Huj7lVKbXesj8O80zMe5s4NgUIBzYopbZZtiwApZTJsj+una9BcwF1dz0L7vvry9XU3RhbNxvD1ymnKCzrnPcdaS07VlDOkfwyrp0Q7tYDZlo9LFoplSMirwK/sRz3oFLKelnMfkCpUqq6icNHAt6YF0JrymkgurV10VzP2apaAG6Ji2DB+P5u+4FwJdZDwHt28+KP3xzgp+/t5KO74wjw0Xc0dCXrLfdjzbHcn+Wu2vquPYy5H2azUuq1Rvt8MQ8KaMoEzFc/e5rZX4l5GWjNTW06XECgnxdLrx6pZzi2odiIoPrg3S/Ql7vfT2bJ+0n866eT8PHUC7R1FevS8hjWpzsDevk5uyod0uomNRG5BPMw5q3AdBFpPAtzIeY+nKaMB44qpRovG12nF6BXnHJTSik2Hy5g+pDeOtjY0exhYbx0/Rh+PFrIrz7Zg9Gk51nrCorKq0nKLGLeCPe+uoHWj1KbAKzCPHAgHjgOPN8oWzrgLSJNTds7Aqvh0E0YBBxsTV0013Mkv4zc0ipmDO3t7Kp0etdOCOf3Vw7nu/25/PzDZF5LOKxHr3VyG9LzMSmYO6KPs6vSYRcMOCISBXwHrAV+Yemj+QNwhYhcbJV1o+XfyU0UcwYYKyKXikhc3Ug3S/mBmPtvNjZxnOYGNh02X5xepAOOQ9w1YzA/Gd+fdQfyeGnNIW58cysf7zhOda15NdTkrGJeSziiA1Ensf5AHn16+DKqfw9nV6XDWuzDEZE+mANNGrDYMqIMzDdrPgq8AEwDUEplisgOYD7m5aGtPQW8A3yBua9nBrDZsu9KoBrzFZTmhjYdPs2g3v6EB7l3+7I7iQr1RzB3jNaaFL9d
uY8/fJ3KkBB/DuaWYTQpvDwM/P6q4ZZ1hwSDwOG8s+w/Vcr4AUGMDu+Jl4fgYRC8PAwcyCklNbuEqUN6Nxj0kZxVzDdHq+k+qFgPBnGwqhojGw+f5toJ/d1uOemmtBhwlFK5wHlr/iqljMDw849gGfBXEblfKVVhlX8/MKWZp7kF+LTxss8i0gtzkJqHuX/nt0qppm4qFcyB7y5L0j+Bx5XlBgYRGWcpZzjmwPkzpVRzgxe0NjpXa2RbRhE3TOw8C6C5g7jBvfHxOkJNrQlPDwO/vCSKgrIa/rvvVP39UNVGE099mdrk8e9vzWqh9ENEhfgzOCQAg0j9ctffZG5z2/s/3NWPRwuoqDZ2iuY0sP1s0R8CjwH3YZ5ZoEWWYDAb87Dpxl7DfOUTBowDvhWRFMuy0daWYF6KeizmH3zrgGPAGyLiDXwJvAq8DtwDfCkiQ5sZvq210a6sM1TWGLkoSjenOVJzs2ZfOaYvi9/eRrXRhKfBwO+uHEZUSHeMSvHVnmxW7spGYV5L6IrRfZkVE4rRpFiflse6A3nUDUOoNSmyCivIKiqvD2B106nogOM46w7kE+DjSdxg91yOoDGbBhylVK2I/BTzMOjW6APcYZnJoJ6I+APXAaOUUmXAZhH5CrgVeLxRGbcDf1FKnbQc+xfgbuANzAMcPIFXLVc8fxORhzEHudXteIlaI5uPnMbDIEwdom/0dDTrIdPWacvvbnr5hgAfT77dl0NNrQkvTwM/nT6ofv+Q0AA2Hj5dv+8vC8cRGxFEclYxN7+9jepaEwYRfUOvA5ksPwRmRod0niHwSimX2zAPo65olPYw8HUTeUuAKVaPJwJnLf//NfBdo/zfAL9popwlQBKQ5OfnpzBfLentAlufW19RYYv/7PR66K11m3e/YapH3A3Ku9+wVu/z7jdM9b//A9X3ztecXv+utHn3jVYRj32j/EfEO70ubdiSWvpud9V1aAKAxvfslABNTUEcYNlnnS/A0rfTeF+z5Sil3lJKTVRKTQwPD3d60HWFLSEhocX9RWXn8O0fzeN3LHB6XZ15HtxpO5edRsnWFZzLTmv1vnPZadwSG4Z3SASHckud/hq6ynvh2X99iYdBOLlzjdNfd2vPwYW4asApAxqPAeyBeZLPC+XtAZQp86tvSzlaG/14tBClYMbQEGdXRbOzaf088TQInyWfdHZVuox1B/KYHNnrgsu0uxNXDTiHAE8RGWqVNhZoashNqmVfU/lSgTHScDzhmGbK0dpo0+HTdPf1ZGx4T2dXRbOzHj7CrGGhrNydTa3RdOEDtA7JKiznUF4ZczrB7ALWXDLgKKXKMd/L84yI+IvIdOAa4IMmsr8PPCQi/UWkH+bJRd+z7EsEjMCDIuIjIg9Y0jfYs/5dgVKKTYcLmDYkGE8Pl3wbaTZ2Q2w4p8+e44dDp51dlU5vnWWyzs4wnY01V/6muA/zhJ75wMfAvUqpVBGZISLWC3q/iXmdnX3AfuBbSxrKPPR5Aeb1eM5gXr9ngdJDojvsWEE52WcquUg3p3UZs4aFEuzvzadJulnN3tZ3ksk6G3PZOc6VUkWYg0Xj9E2YBwPUPVaYZz14tJlydgOxdqpml7X5iHk6mxn6/psuw8vDwILx/Xl/ayZF5dX08u88fQuu5ExFNTszi7l35hBnV8XmXPkKR3NhGw8VMKBXNyKCO9cvMK1lN0wMp8ao+GJ3trOr0mm9u/kYRpNiQK/Ot2KLDjham9UYzXecXxQV0inmd9Jab1ifHozu31OPVrOT5Kxi/pFgvg/+6a9SO90ErDrgaG2WcuIMZedquVjPDt0l3TAx3DzR56nGt7hpHbX58Gnqljmqm0qoM9EBR2uzjYcLMAhMG6IDTld09dh+eHsY9OABO/D1Mk9hYxDw8jR0uqmEdMDR2mzz4dOMDg+kp5+Xs6uiOUGgnzdzR4Tx5Z7s+jV4NNvIOF2On5cHv5oT3Sln5tYBR2uTjYdOs/v4GYaG+ju7KpoTXT8x
nOKKGr5Py3N2VTqNulm754wI48FLhna6YAM64GhtkJxVzM/+vRMFfLUnp9N1aGqtd/HQEMJ6+PBpJxs80NJqqXUL0dnrfb/reDGF5dXMG9m5bva05rL34WiuZ1tGIbVGc4+m0aTXRunKPAzCtRPCeSPxKC+uTueS4WFu/15Izipm8dvbOFdrwtND+OUlQxnU23zLX8bpMv624TC1RvstRLdmfy7eHgbiY0JtWq4r0QFHa7W4wcHUrWvcGTs0tbYZ0bc7CliWeJR3txxz+z6HbRkFVFn6pGqMipfXHmoynz0WolNKsfZAHtOjggnw6bxfy533lWk2N6JvDwzAxEG9ePSyYW795aJ13PGiSsC8CEpnWA20tLIWMP+m8vY08MK1YxjRrwcicOBUKY98lkKNUSF2WIjuYN5ZjhdVcG9855tdwJoOOFqr7TlxBqOCe2YOdusvFs024gYH42EQjCaFl4d7X/FmFZbzwbYsRvfvwaUj+zB1SO8G7/HoMPO8Zg99uI3jZ02Yw6ztrE3NQwQuGd55m9NADxrQ2iA5qwiACQN1sNHMy1k/edVwAH45x31HVRlNioc/TcFDhDdvncgDs5t+LbERQTw+xZfwXt345Sd7OFtVY7M6rEnNZcLAIEK7+9qsTFekA47Wajszi4kOC+hUC0JpHbNocgT+3h6cLK50dlXa7Z3NGezMLGbp1SPpF9jy/GXdPIVXbxxPTkkVT39pm2W1ThZXkHqqlEs78ei0OjrgaK1iNCl2ZRUzMbKXs6uiuRBvTwPTonqTePB0q5YYdjUHc8/y8ppDzBsRxrUT+rfqmNiIIH4xO4qVu7P5ck/HJzGtW/tm7og+HS7L1emAo7XKwdyznD1Xy6RI92w20ewnPiaE7DOVHD1dduHMLqS61sRDK/bQ3deTP107uk0T0T4wK4oJAwP5/Rf7OVlc0aF6rE3NIzosgEG9O//N1DrgaK2SZOm/mRihr3C0huruG0k86F4rgf5jw2FST5Xy3E9G0zvAp03HenoYePXG8SgFD/0nBaOpfVd3xeXV7MgsYl4XuLoBHXC0VkrKLKZPD1/CgzrfGh1ax/QP7MbQ0AC3Wnr6PzuP8/eEI8yM7s1lo9r3ZT8w2I9nrhnJjswi7vjXjnbNQPB9ej5Gk+rUswtYc7mAIyK9RGSViJSLSJaILGoh7yMisl9EzorIMRF5pNH+TBGpFJEyy7bW/q+gc0rKLCI2Mkivf6M1aWZ0CNsziqiornV2VS5ox7FCHv98H0rBtoyiDk1VE9HLD4PApsMF3PjmVra3cTmBtam59O3py+j+PdtdB3ficgEHeA2oBsKAxcAyERnZTF4BbgOCgMuAB0TkpkZ55iulAizbPHtVujPLPlPJqZIqJrnpsFfN/uJjQqk2mth61PXXb/l818n6u2hqjR1bc2bbsaL6/9eaFPd/tIu9J8+06tjKaiMbD59m3oiwLvNDzqUCjoj4A9cBTyqlypRSm4GvgFubyq+UelEptUspVauUOgh8CUx3XI27hqRMS/+NHqGmNWPSoCD8vD3coh+nrMp8FeZhgzVn4gYH4+1pMJflYb4J9iev/8iLq9OpqjG2eOzGw6epqjExb2TX6L8BEFcayigi44EtSik/q7SHgZlKqfkXOFaAXcCbSqk3LGmZQDfMgXU38IhSKqWZ45cASwBCQkJiV6xY0fEX1IIjxUbSi4wM6+VBVJCHXZ+rvcrKyggICOD91HP8eKqW1y7xw8PQNX6JWas7D11Za87Bq8lVZJeZePHibi77i10pxSMbKwn0EcaFeLT589fUebD+LPcNMPBJejWbsmvp5y9cGunF2WrV5PO8vfccu/Nr+dtsPzzd6HPV0nth1qxZyUqpic0d62pT2wQApY3SSoDurTh2KebA8i+rtMWYg5AAvwTWiMgwpdR517xKqbeAtwBiYmJUfHx8W+vepI0HT/Pt/lP07dmNAB9P8s+eIz2nlM1HCjAp8PY08vHdrjnpYWJiIvHx
8bywZyOTBvtwyewpzq6SU9Sdh66sNefghG8WT36xn4hRkxgc4poBOi2nlII1m3joslEsmjKwzcc3dR7iG+W5ci4kHsznNytS+FdqNQAeUsv8sf0YHd6TkO4+9A7wJiUhiciQHgQNGeWSn//mdOTz4NCAIyKJwMxmdm8BfgH0aJTeAzh7gXIfwNyXM0Mpda4uXSm1xSrb8yJyOzAD+LptNW+f/+7N4b6PdjVI8/E04OtlqF+3vLrWxH92HnfZN1xJZQ0H885yxei+zq6K5uLio0MA8/BoVw046y03Wc6x85xl8TGh3BI3kL99fwQFGJXiq5Rsvmh0o2haTimL/2mf5Q5ckUP7cJRS8UopaWa7CDgEeIrIUKvDxgLNziEhIncCjwOXKKUutBqUwny1Y3dF5dU88cW++scGgQdnR5H+x8t4947J+HqZ230FWLkrm+/25TiiWm2263gxSsFEfcOndgEDevkxOMSfRBceHr0uLY9xAwIJ7WH/Ocsujg7Fx/I59/Uy8Ok9U9nz1FzW/fpibpwYXrfSR/1M212BSw0aUEqVAyuBZ0TEX0SmA9cAHzSVX0QWA38C5iqlMhrtGygi00XEW0R8LUOme2O+krKrymojd763k7JztXh7mN9w3p4GZsaEIiLERgSx/K44HpoXw79/OpmxAwK5/6NdfLzjuL2r1mZJmUV4GIRxAwKdXRXNDcRHh7Ito5DK6pY7zJ0ht6SKvSdLmDvCMfe8WH/Ol98VR2xkLwL9vBka1p2FkwbWB6OutLaUq/XhANwHvAvkA4XAvUqpVAARmQF8p5Squ15/FggGdlp1Un6olPo55n6fZcAQoArYA1yulLLrT4lao4lffLyblJNnWLY4lpDuPmzLKCRucHCDS+bYiKD6xxMHBXHf8l38duU+iiuquXfmEJfpdN2ZWcyofj3w83bFt4rmauJjQnh3yzG2HStkloutXLkuzdycNs9BAQcafs4bpy+/K67J74bOzOW+RZRSRcCCZvZtwjywoO7xoBbKSQXG2LyCLVBK8eSXqaxPy+OZa0bW38F8oTeTn7cnb982kYc/TeHF1QdJzyklpk934gb3duobsdakSDlxhlviIpxWB829TB7UC18vAz8cPO16AedAHhHBfkSFukb/UnPBqDNzuYDjzv6x4Qgf7zjOffFDuG1qZJuO9fIw8H8Lx1Fda+KrlBwkJQcfryNO7UzMLDVxrtakJ+zUWs3Xy4Opg4NJPJgPNHe/tuOdraph69ECbp8a6TKtB12RS/XhuKvkrGLu/TCZv6w7xLXj+/PIpTHtKsdgEEb1Nw/Sc4XOxMPF5vXdY/WEnVobxMeEkllYQWZBubOrUm/joQJqjMph/Tda0/QVTgclZxVz81vbqDaaMAjcOGlAh35BxQ3ujY/nEc7VmuyydnpbHCo2Mqi3PyHd2zaTrta1xceYh0f/cOg0kS4y5f66A7kE+Xl1uSYsV6OvcDooIT2faqP5SkCApA5MBAjmdt2P7o4jopcfgd28nDY6TCnF4WKj/oBqbRYR7E/fnr78+8fMDk2MaSs1RhMb0vOZPSwMTw/9ledM+ux3gFKKnZnmJi+DDYc3xkYE8dsrhlFQXs2G9PwOl9ceR0+XU1aD7r/R2iw5q5j8s+fIKChn8dvbnB50dh4rorSqlrkjXGsQQ1ekA04TSs6pVn1IViSdYPuxYm6bGsFv6sba2+iKYM7wMMJ6+PDhtiyblNdWyVl6wk6tfbZlFNYvN33OBW5qXJeWh7engRlDQ5xaD00HnCYVn1MsusAvs4zTZSz96gDThgSzdP5I7p8VZdPmJ08PAzdNGsjGw6c5XtixJWzbY/X+XLwNcKa82uHPrbm3uhmU6zjzKlkpxboDeVwU1Rt/H91l7Ww64DTjXK2JL3ZnN7mvxmjiV//Zg7engVcWjsNgp5leb548EIMIy3c49ionObOIxIOnqTbB4ne2O71JRHMvdTc1LhjXDwUUlDnvR0t67llOFlfq0Wku
QgecZgjw8c7jrNh54rx9r64/xN6TJbxw7Wj69LTfnEx9evoyZ3gonyad5Fyt46YK+XpvTv0CVc4emq25p9iIIP6ycBwRwX68vSnjwgfYyTrLZJ2X2HmyTq11dMBpQpCP8O5PJxE3KJhHP9/LE6v2UV1rHom2PaOQ1xOPcuPEAVzugBmUb4mLoKi8mu/25dr9uerULRNsoGvN86TZlodB+NlFg9h9/Ex9n6Cjra+brLO7/Sfr1C5MB5wm9PQRZsWE8t5PJ3HPzMEs336cm9/exld7srn7/STCuvvw1PwRDqnL9CG9iQz2c+jggb0nSxjetzvXDvXqMtOma/ZxfWw4Pbt58c9Nxxz+3OtS89h7soQR/RqveKI5iw44LfD0MPDby4fz2qIJ7M8u4cFP9lBaVUtRRQ3puS0u0WMzBoOweEoESVnFpOc2XpvO9o4VlJOee5YbYgdw1RBvHWy0DvHz9mTxlIGsSc116OCX5Kxi7l2eDMDnySd1P6SL0AGnFa4c05dFk/+3OqDR6Nh+jetjw/H2NDjkKmdNqrnp7tJRXWeddc2+bp8WiYdBeHeL465ytmUUUmtZ5bDWwZ9XrXk64LTSVWP71S+a5uh+jSB/b64a05dVu7IpO1dr1+davT+XMeE96R/Yza7Po3UdYT18mT+2HyuSTlBSUeOQ56yboUPQ/ZCuRAecVjpvMSUHNzXdEhdBebWx2aHatpBTUsmeE2e4dKS+utFs666LBlNRbeQjBy0yWPfDbOHEcN0P6UL0nVBt4Mz1K8YPCGRE3x68vTGDkspqu6yVs2a/uTntct2cptnYiH49mB4VzHs/HuNnFw1qcGOoPWxIy6e7jyd/XDDa7s+ltZ7+S7gJEeHiob3JKqrgL2sPsfiftp+janVqLtFhAQwOcY0FqrTO5a4Zg8krPce3+07Z9XlMJsWGg/lcHBOig42Lcbm/hoj0EpFVIlIuIlkisqiFvEtFpEZEyqy2wVb7x4lIsohUWP4d55hXYR++3h4AmJTtb8gsLDvHjmNFXKab0zQ7mTk0hKjQAP66/jCvJRy228ixfdklnD57jkuG6Zs9XY3LBRzgNaAaCAMWA8tEpKWlA/+jlAqw2jIARMQb+BL4EAgC/g18aUl3SzOGhuDlYZ5Gx2Cw7Vo569PyMCk9Ok2zH4NBmDvcvDibva7SAb5Pz8cg5oXgNNfiUgFHRPyB64AnlVJlSqnNwFfAre0oLh5zH9WrSqlzSqm/YR60MttW9XU088CFKfTy96Z3gA9jwnvarOzV+3MZ0KsbI/rqm+Q0+/H1st9Vep0N6XlMGBhEL3+3/W3ZabnaoIFooFYpdcgqLQWY2cIx80WkCMgB/qGUWmZJHwnsVXXzpJvttaSvblyIiCwBlgCEhISQmJjY7hdhb7fFCK/uquIPH37PnAivDpdXUaPYeKiCuRGe/PDDD/XpZWVlLn0eHEWfB9udA/+zRgxiDjgGAZ8zWSQmnux4BS2Kq0zsz67k+mgvu/zN9HuhY+fA1QJOAND4dvoSoHsz+VcAbwF5wBTgcxE5o5T62FJWSWvLUkq9ZSmLmJgYFR8f3576O8RMpdhRsp1vs0p59Mbp9PDtWND5ck82RrWHuy+f3GDkW2JiIq58HhxFnwfbnYN4IHRQNg9+soebJkdw14JRHS7T2kfbjwP7WHLlVKLDmvvaaD/9XujYOXBok5qIJIqIambbDJQBjdt0egBNziOjlDqglDqllDIqpX4E/gpcb9ndprLciYjwuyuGc6ayhtcTjna4vNX7cwnt7sN4Jy1nrXUtV4/rz6TIILYcKaBhA0THfZ+WR3hQN4aG6pGWrsihAUcpFa+Ukma2i4BDgKeIDLU6bCyQ2tqnwNxPg+WYMSJivVjNmDaU5dJG9e/JT8b3590txzhZ3P45qiqrjSQePM2lI/vYbV0fTWts4cQBZBSUszPTdoMGKquNbD5SwJzhYTT82GuuwqUGDSilyoGVwDMi4i8i04FrgA+ayi8i
14hIkJhNBh7EPDINIBEwAg+KiI+IPGBJ32DXF+FAD8+LQYCX1xxsdxkbD5+mssbIZXp0muZAV47pS4CPJ5/stN3MA1szCjhXa2K2Hg7tslwq4FjcB3QD8oGPgXuVUqkAIjJDRMqs8t4EHMHcTPY+8Gel1L8BlFLVwALgNuAMcCewwJLeKfQL7MZdMwbxxZ5T7D15pl1lrN6fS6CfF5MH9bJx7TSteX7enlw9rh//3ZdDaZVt5ldbn5aPv7cHUwbr97KrcrmAo5QqUkotUEr5K6UGKqU+stq3SSkVYPX4ZqVUsOX+m2GWoc/WZe1WSsUqpboppSYopXY78rU4ws9nDiHY35vnvk1rc3v49oxCvt2Xw4QBgXh5uNxbQevkbpw4gKoaE1/t6fjMA0opNqTlM2NoCD6eHjaonWYP+lvGzXX39eJXc6PZfqyIh1aktPpGuuSsYm55ZzvVtSY2HSnQ64VoDjcmvCfD+nTnP00s495WqadKyS2tYrZeStql6YDTCcSEBSDAqt3Z3PjmVn48UtBi/gOnSvnNij3UGM1XRCaT0uuFaA4nItw4aQD7sktIPdX4Doa22ZCejwjM0rMLuDQdcDqBnZnF1A3KqTUpfvreTl5cnU5+aVWDfDkllTz8aQpX/n0Tp8+ew9MgTlnfR9Pq/GR8f7w9Dazo4FXO9+n5jA0PJKS7j41qptmDq934qbVD3OBgvD0N1NSa8PAwMG5AIMt+OMrbmzK4emx/JkYG8eWebHZlFQPC3TMGc398FEdOl7Eto5C4wcF6vRDNKQL9vLlsZB9W7c7mt1cMr5/6pi3yz1aRcuIMv5kbbYcaarakA04nULc4nHXwyCos593Nx/h4xwk+32WeOsQg8Nqi8Vw+um/9cTrQaM5246QBfJVyijWpuVwzrn+bj09MPw3AJcPDbF01zcZ0wOkkGgePiGB/foAA3AAACwVJREFU/nDNKHp08+IfG47U3xGbUVDutDpqWlOmDg5mQK9ufLLjRLsCzmfJJ+ju40lltX2XX9c6TvfhdHLxMaH4eBl0X43msgwGYWHsALZmFJJV2LYfRFuPFrAjs5iz52pZ/M52PdrSxemA08nVNbc9NC9Gr+2uuazrJ4YjwGOf721T0Hh57f8mlrfXcgea7eiA0wXERgRx/6woHWw0l3XqTBUisC2jqNULs320/TjJWcV46NGWbkP34Wia5nTbMgqpmyijqsbEpsOnW/yB9OPRAp76cj8zo0O4f9YQdmYW69GWbkAHHE3TnC5ucDA+XgbO1ZhQwDd7c7g1LoLggPPvqzlWUM69H+4isrc/f180nh6+XkwepK9s3IFuUtM0zenq+hofvjSGxy6L4URRBde/sZXjhQ2X3iiprOFn/96JQeCd2yd2ePFBzbF0wNE0zSXU9TXeGx/FR3dPobiimmuXbWHfSfO0N7VGEw98tIsTRRW8cUssEcH+Tq6x1lY64Gia5nJiI3rx2c+n4ePpwU1vbeWdzRksfHMrmw4X8NyC0UzRgwPckg44mqa5pKjQAFbeN43eAT788Zs0dh0/g4dBGKKXj3ZbOuBomuaywnr4cs34fv9LUHpmc3emA46maS5tZnQovnq2jE7B5QKOiPQSkVUiUi4iWSKyqIW834lImdVWLSL7rPZnikil1f61jnkVmqbZip4to/NwxftwXgOqgTBgHPCtiKQopVIbZ1RKXW79WEQSgQ2Nss1XSq23U101TXMAPbN55+BSVzgi4g9cBzyplCpTSm0GvgJubcWxkcAM4H171lHTNE1rH5cKOEA0UKuUOmSVlgKMbMWxtwGblFKZjdKXi8hpEVkrImNtVE9N0zStjVytSS0AKG2UVgJ0b8WxtwHPNkpbDOzCvBTML4E1IjJMKXWm8cEisgRYAhASEkJiYmLbat4JlZWV6fOAPg+gz0EdfR46eA6UUg7bgERANbNtBsYDFY2O+Q3w9QXKvQgoAwIukC8dc59Oi/WMjo5WmlIJCQnOroJL0OdB
n4M6+jy0fA6AJNXCd6tDr3CUUvEt7bf04XiKyFCl1GFL8ljgvAEDjdwOrFRKlV2oCpivdjRN0zQHc6k+HKVUObASeEZE/EVkOnAN8EFzx4hIN2Ah8F6j9IEiMl1EvEXEV0QeAXoDW/6/vfsNkeuqwzj+fezWpGYTQ9IaocUsqU1ttqQmVgQ1GGgkti9EzZuyUQzYBlsDYt+0oJKthhZB8IW01eCW1kRSpY0RQXwhNGIKkQZNrdumKREarVFpzJ/tpln/8PPFOWPGcXebnc2cO7P3+cCFvfdMht99MtzfzN2zZzp2AmZmNqWuajjZ3cAVwN+APcBdkadES1onqfVTzCeA08DTLccXAo8Ap4BXgY8Bt0aE/0zZzKwC3TZpgIj4O6mJTDb2K9LEguZje0iNqfWxo8DqTtRoZmYz142fcMzMbA5ywzEzsyLccMzMrAg3HDMzK8INx8zMinDDMTOzItxwzMysCDccMzMrwg3HzMyKcMMxM7Mi3HDMzKwINxwzMyvCDcfMzIpwwzEzsyLccMzMrAg3HDMzK8INx8zMinDDMTOzIrqu4UjaJumQpAlJj13E478k6S+Szkp6VNK8prEBSU9LOifpiKQNHS3ezMym1HUNB/gzsAN49M0eKGkjcB9wC7AcWAHc3/SQPcBvgaXAl4EnJV11qQs2M7M313UNJyL2RsQ+4ORFPPyzwEhEjEbEKeDrwBYASSuBtcD2iHgjIp4Cngc2daZyMzObTl/VBczSIPCTpv3ngGWSluaxP0TEWMv44GRPJGkrsDXvTkj6fQfq7TVXAq9VXUQXcA7OoME5TJ/B8un+Ya83nH7gTNN+4+eFk4w1xq+e7IkiYiewE0DSoYi4+dKW2nucQ+IcnEGDc5hdBkVvqUnaLymm2A608ZSvA4ua9hs/j00y1hgfw8zMiivacCJifURoiu3DbTzlKHBT0/5NwF8j4mQeWyFpYcv4aPtnYGZm7eq6SQOS+iTNBy4DLpM0X9JUt/6+D3xO0ipJi4GvAI8BRMRR4DCwPT/HJ4HVwFMXUcbO2Z7HHOEcEufgDBqcwywyUERcykJmTdIwsL3l8P0RMSzpXcALwKqIOJ4ffw9wL3AFqZl8PiIm8tgAqQF9ADgOfCEiftH5szAzs1Zd13DMzGxu6rpbamZmNje54ZiZWRFuOE0kLZH0Y0njkl6RNFR1TSVMt36dpFvyOnTn8rp00/5hV6+SNE/SSP5/H5N0WNKtTeN1yWG3pBN5bcKjku5oGqtFBs0kXSfpvKTdTceG8utkXNI+SUuqrLGT8p+ynJf0et5eahqbcQ5uOP/rIeAfwDJgM/CIpElXJphjJl2/TtKVwF7gq8AS4BDww+LVldEH/BH4CPB20ozHH+UFYOuUw4PAQEQsAj4O7JD0vppl0Owh4NnGTr4efBf4DOk6cQ54uJrSitkWEf15ux7az8GTBjJJC4BTwI15SjWSdgGvRsR9lRZXiKQdwDURsSXvbwW2RMQH8/4C0pIWayLiSGWFFiLpd6TFYJdSwxwkXQ/sB74ILKZmGUi6HfgUaWbsuyPi05IeIDXkofyYa4EXgaUty2jNCZL2A7sj4nstx9vKwZ9wLlgJ/KvRbLIp116riUFSBgBExDhwjBpkImkZ6TUxSs1ykPSwpHPAEeAE8DPql8Ei4GvAPS1DrTkcI90VWVmuuuIelPSapGckrc/H2srBDeeCfuBsy7EzpHXZ6mqq9ejmdCaSLgd+ADye373XKoeIuJt0butIt9EmqFkGpJXnRyLiTy3H65bDvaSvfbma9AefP82fZtrKwQ3nAq+99v9ql4mktwC7SO/WtuXDtcshIv4dEQeAa4C7qFEGkt4LbAC+NclwbXIAiIhfR8RYRExExOPAM8BttJlDr68WfSkdBfokXRcRL+djdV97bZT0nUPAf+/bX8sczUSSgBHSL0Fvi4h/5qFa5dCijwvnWpcM1gMDwPH0kqCftMzWKuDnNK3fKGkFMI90/aiDAETL
OpYXnUNEeMsb8ATpW0IXAB8ifUQcrLquAufdB8wnzVDalX/uA67KGWzKx74BHKy63g7m8B3gINDfcrwWOQDvAG4nX2CBjcA4abZaLTLIObwNeGfT9k3gyZzBIOnW+7p8ndgNPFF1zR3KYXF+DTSuB5vz62FluzlUflLdtJGme+7LoR4HhqquqdB5D+d3Ls3bcB7bQPrl8RukGUsDVdfboQyW5/M+T7pd0Ng21yWHfEH9JXA6X0yeB+5sGp/zGUyRyzBpplZjfyhfH8ZJXwC5pOoaO/h6eJZ0m+x0fjP20dnk4GnRZmZWhCcNmJlZEW44ZmZWhBuOmZkV4YZjZmZFuOGYmVkRbjhmZlaEG46ZmRXhhmPWIyQtkjQs6YaqazFrhxuOWe+4GdgOXF51IWbtcMMx6x1rSF8V8ELVhZi1w0vbmPUASS8C72k5vDciNlVRj1k73HDMeoCk95NWMx8FHsiHT0TEK9VVZTYz/j4cs97wHOnL0L4dEQerLsasHf4djllvGATeCvym6kLM2uWGY9Yb1pK+r+dw1YWYtcsNx6w3rAGORcTZqgsxa5cbjllvWIWnQ1uP86QBs95wGlgraSNwBng5Ik5WXJPZjHhatFkPkHQjMAKsBuYD6yLiQLVVmc2MG46ZmRXh3+GYmVkRbjhmZlaEG46ZmRXhhmNmZkW44ZiZWRFuOGZmVoQbjpmZFeGGY2ZmRfwH/x/QFC6b8qsAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "y_pred = model.predict(X_valid)\n", "plot_series(X_valid[0, :, 0], y_valid[0, 0], y_pred[0, 0])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "BaS5Yqbx5jk7" }, "source": [ "## Using a Simple RNN" ] }, { "cell_type": "markdown", "source": [ "Let’s see if we can beat that with a simple RNN. It just contains a single layer, with a single neuron. We do not need to specify the length of the input sequences (unlike in the previous model), **since a recurrent neural network can process any number of time steps** (this is why we set the first input dimension to None). By default, the SimpleRNN layer **uses the hyperbolic tangent activation function**. It works exactly as we saw earlier: the initial state $h_{init}$ is set to 0, and it is passed to a single recurrent neuron, along with the value of the first time step, $x_0$. The neuron computes a weighted sum of these values and applies the hyperbolic tangent activation function to the result, and this gives the first output, $y_0$. In a simple RNN, this output is also the new state $h_0$. This new state is passed to the same recurrent neuron along with the next input value, $x_1$, and the process is repeated until the last time step. Then the layer just outputs the last value, $y_{49}$. All of this is performed simultaneously for every time series." ], "metadata": { "id": "4H7cLxnfAVEW" } }, { "cell_type": "code", "execution_count": 13, "metadata": { "id": "LlXfar6n5jk7" }, "outputs": [], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.SimpleRNN(1, input_shape=[None, 1])\n", "])" ] }, { "cell_type": "markdown", "source": [ "Note that for each neuron, a linear model has one parameter per input and per time step, plus a bias term (in the simple linear model we used, that’s a total of 51 parameters). 
In contrast, for each recurrent neuron in a simple\n", "RNN, **there is just one parameter per input and per hidden state dimension** (in a simple RNN, that’s just the number of recurrent neurons in the layer), plus a bias term." ], "metadata": { "id": "h11FMMnGCCeg" } }, { "cell_type": "code", "source": [ "model.summary()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "E082xqnTBttm", "outputId": "47389102-2276-41ab-c5fd-173c8a079662" }, "execution_count": 14, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_1\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " simple_rnn (SimpleRNN) (None, 1) 3 \n", " \n", "=================================================================\n", "Total params: 3\n", "Trainable params: 3\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ] }, { "cell_type": "code", "source": [ "optimizer = keras.optimizers.Adam(learning_rate=0.005)\n", "model.compile(loss=\"mse\", optimizer=optimizer)\n", "history = model.fit(X_train, y_train, epochs=20,\n", " validation_data=(X_valid, y_valid))" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "DFkksB7EBsjd", "outputId": "4e89dcd8-f38c-4e73-86eb-5bb18bf7ca58" }, "execution_count": 15, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 13s 54ms/step - loss: 0.0967 - val_loss: 0.0489\n", "Epoch 2/20\n", "219/219 [==============================] - 11s 52ms/step - loss: 0.0369 - val_loss: 0.0296\n", "Epoch 3/20\n", "219/219 [==============================] - 11s 52ms/step - loss: 0.0253 - val_loss: 0.0218\n", "Epoch 4/20\n", "219/219 [==============================] - 18s 80ms/step - loss: 0.0198 - val_loss: 0.0177\n", "Epoch 5/20\n", "219/219 
[==============================] - 15s 68ms/step - loss: 0.0166 - val_loss: 0.0151\n", "Epoch 6/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0146 - val_loss: 0.0134\n", "Epoch 7/20\n", "219/219 [==============================] - 11s 51ms/step - loss: 0.0132 - val_loss: 0.0123\n", "Epoch 8/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0124 - val_loss: 0.0116\n", "Epoch 9/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0118 - val_loss: 0.0112\n", "Epoch 10/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0116 - val_loss: 0.0110\n", "Epoch 11/20\n", "219/219 [==============================] - 11s 52ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 12/20\n", "219/219 [==============================] - 11s 52ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 13/20\n", "219/219 [==============================] - 12s 56ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 14/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 15/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 16/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 17/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 18/20\n", "219/219 [==============================] - 12s 53ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 19/20\n", "219/219 [==============================] - 11s 52ms/step - loss: 0.0114 - val_loss: 0.0109\n", "Epoch 20/20\n", "219/219 [==============================] - 11s 52ms/step - loss: 0.0114 - val_loss: 0.0109\n" ] } ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "id": "xp6oMcGi5jk8", "outputId": "345df898-d649-4d5b-88f4-465dcea37f37", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { 
"output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 0s 8ms/step - loss: 0.0109\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "0.010881561785936356" ] }, "metadata": {}, "execution_count": 16 } ], "source": [ "model.evaluate(X_valid, y_valid)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "id": "ArbK1I4Z5jk8", "outputId": "54db10c9-856c-4d19-f5d0-dcc062d70344", "colab": { "base_uri": "https://localhost:8080/", "height": 291 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_learning_curves(history.history[\"loss\"], history.history[\"val_loss\"])\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "id": "-r6pZD_05jk8", "outputId": "4be1cf45-3e15-4119-e0dc-b52eb6601d20", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "y_pred = model.predict(X_valid)\n", "plot_series(X_valid[0, :, 0], y_valid[0, 0], y_pred[0, 0])\n", "plt.show()" ] }, { "cell_type": "markdown", "source": [ "There are many other models to forecast time series, such as weighted moving average models or autoregressive integrated moving average (ARIMA) models. Some of them require you to first remove the trend and seasonality. Once the model is\n", "trained and starts making predictions, you would have to add them back. When using RNNs, it is generally not necessary to do all this, but it may improve performance in some cases, since the model will not have to learn the trend or the seasonality. " ], "metadata": { "id": "AqHSj9gsDGCv" } }, { "cell_type": "markdown", "metadata": { "id": "xWFI7MMA5jk8" }, "source": [ "## Deep RNNs" ] }, { "cell_type": "markdown", "source": [ "Implementing a deep RNN with `tf.keras` is quite simple: just stack recurrent layers. In this example, we use three SimpleRNN layers. Make sure to set `return_sequences=True` for all recurrent layers (except the last one, if you only care about the last output). 
**If you don’t, they will output a 2D array (containing only the output of the last time step)** instead of a 3D array (containing outputs for all time steps), and the next recurrent layer will complain that you are not feeding it sequences in the expected 3D format." ], "metadata": { "id": "5Za9vyD9DRtu" } }, { "cell_type": "code", "execution_count": 19, "metadata": { "id": "S63hdy8B5jk8", "outputId": "7b254938-a536-4839-e0da-25876725fb38", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_2\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " simple_rnn_1 (SimpleRNN) (None, None, 20) 440 \n", " \n", " simple_rnn_2 (SimpleRNN) (None, None, 20) 820 \n", " \n", " simple_rnn_3 (SimpleRNN) (None, 1) 22 \n", " \n", "=================================================================\n", "Total params: 1,282\n", "Trainable params: 1,282\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "# By default, recurrent layers in Keras only return the final output. 
\n", "# To make them return one output per time step, you must set return_sequences=True\n", "# For how the parameter counts below are derived, see https://d2l.ai/chapter_recurrent-neural-networks/rnn.html#recurrent-neural-networks-with-hidden-states\n", "model = keras.models.Sequential([\n", " keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]), # input 1*20 + recurrent 20*20 + bias 20 = 440\n", " keras.layers.SimpleRNN(20, return_sequences=True), # input 20*20 + recurrent 20*20 + bias 20 = 820\n", " keras.layers.SimpleRNN(1) # input 20*1 + recurrent 1*1 + bias 1 = 22\n", "])\n", "\n", "model.summary()" ] }, { "cell_type": "code", "source": [ "model.compile(loss=\"mse\", optimizer=\"adam\")\n", "history = model.fit(X_train, y_train, epochs=20,\n", " validation_data=(X_valid, y_valid))" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "SfAv-77vDlcm", "outputId": "7138b70e-7342-4ca4-8549-28f58f537b8d" }, "execution_count": 20, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 41s 178ms/step - loss: 0.0492 - val_loss: 0.0090\n", "Epoch 2/20\n", "219/219 [==============================] - 38s 173ms/step - loss: 0.0070 - val_loss: 0.0065\n", "Epoch 3/20\n", "219/219 [==============================] - 47s 217ms/step - loss: 0.0053 - val_loss: 0.0045\n", "Epoch 4/20\n", "219/219 [==============================] - 62s 282ms/step - loss: 0.0045 - val_loss: 0.0040\n", "Epoch 5/20\n", "219/219 [==============================] - 54s 246ms/step - loss: 0.0042 - val_loss: 0.0040\n", "Epoch 6/20\n", "219/219 [==============================] - 49s 226ms/step - loss: 0.0038 - val_loss: 0.0036\n", "Epoch 7/20\n", "219/219 [==============================] - 37s 171ms/step - loss: 0.0038 - val_loss: 0.0040\n", "Epoch 8/20\n", "219/219 [==============================] - 38s 172ms/step - loss: 0.0037 - val_loss: 0.0033\n", "Epoch 9/20\n", "219/219 [==============================] - 54s 245ms/step - loss: 0.0036 - val_loss: 0.0032\n", "Epoch 10/20\n", "219/219 
[==============================] - 37s 169ms/step - loss: 0.0035 - val_loss: 0.0031\n", "Epoch 11/20\n", "219/219 [==============================] - 37s 170ms/step - loss: 0.0034 - val_loss: 0.0030\n", "Epoch 12/20\n", "219/219 [==============================] - 37s 168ms/step - loss: 0.0033 - val_loss: 0.0031\n", "Epoch 13/20\n", "219/219 [==============================] - 37s 170ms/step - loss: 0.0034 - val_loss: 0.0031\n", "Epoch 14/20\n", "219/219 [==============================] - 38s 173ms/step - loss: 0.0033 - val_loss: 0.0032\n", "Epoch 15/20\n", "219/219 [==============================] - 38s 174ms/step - loss: 0.0034 - val_loss: 0.0033\n", "Epoch 16/20\n", "219/219 [==============================] - 38s 172ms/step - loss: 0.0035 - val_loss: 0.0030\n", "Epoch 17/20\n", "219/219 [==============================] - 38s 173ms/step - loss: 0.0033 - val_loss: 0.0029\n", "Epoch 18/20\n", "219/219 [==============================] - 37s 170ms/step - loss: 0.0033 - val_loss: 0.0030\n", "Epoch 19/20\n", "219/219 [==============================] - 37s 170ms/step - loss: 0.0032 - val_loss: 0.0029\n", "Epoch 20/20\n", "219/219 [==============================] - 37s 169ms/step - loss: 0.0032 - val_loss: 0.0029\n" ] } ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "id": "42_p59jH5jk8", "outputId": "4bd43337-53b7-451b-87a7-4acf6cc28e3a", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 3s 47ms/step - loss: 0.0029\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "0.002910560928285122" ] }, "metadata": {}, "execution_count": 21 } ], "source": [ "model.evaluate(X_valid, y_valid)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "id": "fsQp1KDq5jk9", "outputId": "4d353299-c25f-4a96-da4b-f6d63ae212d6", "colab": { "base_uri": "https://localhost:8080/", "height": 291 } }, "outputs": [ { "output_type": 
"display_data", "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZgAAAESCAYAAADAEMPrAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deXxU1fn48c8zyWQhIYGwJCxh0bJIkL1iFASLoFZBK+jP4sZXKlbrQhWhtnUp+tVS4au2oqKoqPBVtMpXW6kiCGURUbSoLEqrgGxhFUhC1pnn98edhEkcIMvcmYE879frvmbm3DP3PjOZzDPn3HvPEVXFGGOMCTdPtAMwxhhzcrIEY4wxxhWWYIwxxrjCEowxxhhXWIIxxhjjCkswxhhjXGEJxhhjjCsimmBEJENE5olIoYhsEZHRR6knIjJFRPYFlikiIkHrNbCNgsAyM3KvwhhjTE3ER3h/04FSIBPoBbwjIp+r6rpq9cYBlwI9AQXeBzYBTwfV6amq/3E/ZGOMMXURsRaMiKQAI4F7VLVAVZcDbwPXhKh+HTBNVbep6nZgGjAmUrEaY4ypv0i2YDoD5aq6Majsc2BQiLo5gXXB9XKq1VkqIh7gQ+AOVd0caqciMg6nRURycnLf7OzsukUf4Pf78Xhi49BVLMUCsRWPxRJaLMUCsRWPxRLaxo0b96pqizo9WVUjsgADgbxqZTcAS0LU9QFdgx53wukqk8Djc4AEoAnwBLAWiD9eDH379tX6Wrx4cb23ES6xFItqbMVjsYQWS7GoxlY8FktowGqt4/d+JFNkAZBWrSwNyK9B3TSgIPBiUdWlqlqqqgeA24GOwGnhD9kYY0xdRTLBbATiRaRTUFlPoPoBfgJlPWtQr4ICcoz1xhhjIixiCUZVC4E3gckikiIiZwOXAC+HqP4ScIeItBGR1sCdwCwAEckRkV4iEiciqTgnAGwHNkTidRhjjKmZSB9FuhlIBnYDrwA3qeo6ERkoIgVB9WYAfwO+xDm+8k6gDJxTnOcCh4BvgQ7AxapaFpFXYIwxpkYieh2Mqu7Hub6levkyIDXosQITA0v1uh8AXVwM0xhjTBhE+kJLY0w9HDp0iN27d1NWVrcGe3p6Ohs2xE5vcizF01BjSUlJoW3btq6cFm0JxpgTxKFDh9i1axdt2rQhOTmZoNGTaiw/P5/GjRu7EF3dxFI8DTEWv9/P9u3b2bt3Ly1btgz79mPjSh5jzHHt3r2bNm3a0KhRozolF2Oq83g8ZGZmcvDgQXe278pWjTFhV1ZWRnJycrTDMCcZr9dLeXm5K9u2BGPMCcRaLibc3PxMWYIxxhjjCkswxhhjXGEJxhhzwhkzZgwXX3xxrZ4zePBgbrnlFpciOuL++++ne/furu/nRGCnKRtjXHO8/v3Ro0czZ86cWm/38ccfrxhpvcbefPNNvF5vrfdl6s4SjDHGNTt37qy8//e//50bbrihSln1s5fKyspqlATS09NrHUtGRkatn2Pqx7rIjGlgVq6Ehx92bt2WlZVVuTRp0qRKWXFxMdnZ2bzyyiv85Cc/ITk5mRkzZrBv3z5+/vOf07ZtW5KTk8nJyeGFF16ost3qXWSDBw/m5ptv5re//S3NmzenZcuWTJgwAb/fX6VOcBdZhw4dePDBB7nxxhtJS0uja9euPPLII1X2s3HjRgYNGkRSUhJdunRh/vz5pKamMmvWrBq/B36/nwceeIDs7GwSExM5/fTTeeutt6rUmTx5Mu3btycxMZGsrCzGjRtXuW7p0qWceeaZpKamkp6ezhlnnMHatWtrvP9oshaMMSeo8eNhzZraPef775NZuxb8fvB4oEcPqE1joFcveOyx2u3zeO6++26mTp3Kc889h9frpbi4mD59+jBp0iTS0tJYuHAhN954I+3atWPIkCFH3c6cOXO4/fbb+fDDD1mzZg2j
R4+mb9++/PznPz/qcx599FH+8Ic/cNdddzFv3jwmTpzIgAEDyM3Nxe/387Of/YysrCw++ugjioqKGD9+PCUlJbV6fY8//jiPPPIITz/9NP369WP27NlcdtllfPrpp/Tq1Ys33niDqVOn8sorr3D66aeze/dulixZAjgtvEsuuYSxY8cyZ84cysrK+Oyzz4iLi6tVDNFiCcaYBuTgQaHiR73fDwcP1i7BuOHWW29l1KhRVcruuuuuyvvjxo3jgw8+4JVXXjlmgunWrRuTJ08GoHPnzjz77LMsWrTomAlm2LBhla2aX/7ylzzzzDMsWrSI3Nxc3n//fb7++msWLFhAmzZtACchnX322bV6fVOnTmXChAmMHj0acForS5cuZerUqcyePZstW7bQqlUrhg0bhtfrpV27dnTp4ozne+jQIQ4cOMDw4cM59dRTAejatWut9h9NlmCMOUHVpSWxcGExI0akUFoKCQkwZw7k5oY/ttro169flcc+n48//vGPzJ07l+3bt1NSUkJpaSmDBw8+5nZ69OhR5XHr1q3ZvXt3nZ/z1Vdf0bp168rkAvDjH/+4VoNCHjp0iB07dvwgKQ0YMID58+cDcPnll/P444/TsWNHzj//fC644ALOPfdcGjduTEZGBmPGjOH8889nyJAhDBkyhFGjRtGuXbsaxxBNdgzGmAakf38/ixbBAw/AokXRTy7gjOYbbOrUqUybNo277rqLRYsWsWbNGi699FJKS0uPuZ3qJweISJVjMOF6TrhUnGGXnZ3N119/zYwZM0hLS+POO+/knHPOobCwEIAXXniBVatWcc455/D222/TpUsX3nvvvYjEWF+WYIxpYHJz4e67YyO5hLJ8+XKGDx/ONddcQ69evTj11FPZuHFjxOPo2rUrO3bsYMeOHZVlq1evrlUCSktLo3Xr1qxYsaJK+fLly+nWrVvl46SkJC666CIeffRRPvnkEzZs2FDlOT179mTSpEksWbKEwYMH8+KLL9bjlUWOdZEZY2JK586dmTt3LsuXL6d58+b85S9/YdOmTfTu3TuicQwdOpQuXbpw3XXXMXXqVIqKirjjjjuIj4+v1fhdd911F/feey+dOnWib9++zJ49m2XLlvHZZ58BMGvWLMrLy+nfvz+pqanMnTsXr9dLp06d2LRpEzNmzGDEiBG0adOGb7/9li+++IKbbrrJrZcdVpZgjDEx5fe//z2bNm3iwgsvJDk5mTFjxnDVVVexfv36iMbh8XiYN28ev/jFLzjjjDPo0KED06ZN47LLLiMpKanG27ntttvIz89n4sSJ7Nq1iy5duvDGG2/Qs2dPAJo0acKUKVOYMGECZWVldOvWjdmzZ9OxY0d27drFxo0bufzyy9m7dy+ZmZlcddVVTJo0ya2XHV6q2mCWvn37an0tXry43tsIl1iKRTW24jkZY1m/fn29t3Ho0KEwRBI+sRRPTWJZs2aNArp69eqoxxJOx/psAau1jt+51oIxxpijmDdvHikpKXTq1InNmzdzxx130LNnT/r06RPt0E4IlmCMMeYo8vPzmTRpElu3bqVp06YMHjyYRx991OblqSFLMMYYcxTXXnst1157bbTDOGHZacrGGGNcYQnGGGOMKyzBGGOMcYUlGGOMMa6wBGOMMcYVlmCMMca4whKMMSbm3X///XTv3v2oj0O55ZZbjjvEf1327Zbqs3SeDCzBGGNcM2LEiKNOErZhwwbS0tJYsGBBrbc7YcIE/vnPf9Y3vCq2bNmCiLB69WrX99VQWIIxxrhm7NixLF68mM2bN/9g3XPPPUe7du0477zzar3d1NRUmjVrFoYIY2tfJxtLMMY0NCtXwsMPO7cuu+iii8jMzOSFF16oUl5WVsbLL7/M1VdfjaoyduxYOnbsSHJyMp06deJPf/rTMeddqd5t5fP5mDBhAk2bNqVp06aMHz8en89X5TnvvvsuAwcOpGnTpmRkZHD++eezYcOGyvWnn3464MxaKSKV3WvV9+X3
+3nggQfIzs4mMTGR008/nbfeeqty/ebNmxER3njjDYYOHUqjRo3o1q0b77//fq3eu5KSEsaPH09mZiZJSUmceeaZLF++vMp7eNttt9G6dWsSExPJzs7mN7/5TeX6N998kx49epCcnExGRgaDBg1i165dtYqhvmyoGGNOVOPHw5o1tXpK8vffw9q14PeDxwM9ekB6es030KtXreZqjo+P57rrrmPWrFncd999ldMN/+1vf2Pv3r1cffXV+P1+2rRpw2uvvUaLFi34+OOPGTduHM2aNWPs2LE12s+0adN49tlnefbZZ+nRowfTp09nzpw5VQalLCwsZPz48fTo0YOioiIefPBBhg8fzvr160lISGDx4sWce+65vPvuu/Ts2ZOEhISQ+3r88cd55JFHePrpp+nXrx+zZ8/msssu49NPP6VXr16V9X73u9/xyCOP8OSTT/Lggw9y5ZVXsmXLFlJTU2v0miZOnMhrr73G888/zymnnML//M//cMEFF/Dvf/+bVq1a8ec//5l58+bx6quv0qFDB7Zt28bXX38NQF5eHldeeSUPP/wwI0eOpKCggI8++qhG+w0nSzDGNCBy8KCTXMC5PXiwdgmmDsaOHcuUKVNYuHAhw4YNA5zusWHDhtG2bVu8Xi+TJ0+urN+hQwc+++wzXnnllRonmMcee4yJEydyxRVXAE4SqD6t8MiRI6s8fuGFF0hLS+Pjjz9mwIABNG/eHIBmzZqRlZV11H1NnTqVCRMmMHr0aAAmT57M0qVLmTp1KrNnz66s9+tf/5rhw4cD8NBDD/HSSy+xZs0aBgwYcNzXU1hYyFNPPcXMmTO56KKLAHj66af54IMPmD59Og8++CBbtmyhc+fODBw4EBGhXbt2nHXWWQDs2LGDsrIyRo0aRfv27QEicqJCdRFNMCKSATwHDAP2Aner6v+GqCfAH4FfBIpmAr8JzE0QXO9a4EXgBlWd6WbsxsScWrQkKhQvXEjKiBFQWgoJCTBnjutzJ3fq1IlBgwbx/PPPM2zYMHbs2MF7773Hq6++Wlnn6aefZubMmWzZsoWioiLKysoqvxiP5+DBg+zcuZPcoNfh8Xjo378/W7durSz75ptvuOeee1i1ahV79uzB7/fj9/v57rvvavxaDh06xI4dOzj77LOrlA8YMID58+dXKevRo0fl/datWwOwe/fuGu1n06ZNlJWVVdlPXFwcubm5lROvjRkzhqFDh9K5c2eGDRvGT3/6Uy688EI8Hg89e/bkvPPOo3v37gwbNozzzjuPUaNG0aJFixq/1nCI9DGY6UApkAlcBTwlIjkh6o0DLgV6Aj2A4cCNwRVEpCnwW2CdmwEbczLx9+8PixbBAw84ty4nlwpjx47l//7v/9i/fz+zZs0iIyODSy65BIC5c+cyfvx4xowZw3vvvceaNWu4+eabKS0tDWsMF198MXv27GHGjBmsWrWKf/3rX8THx4dtP9WH8Pd6vT9Yd6zjSrXdT58+fdi8eTMPP/wwfr+f6667jqFDh+L3+4mLi2PBggUsWLCAHj168Nxzz9GpUyc+//zzeu+/NiKWYEQkBRgJ3KOqBaq6HHgbuCZE9euAaaq6TVW3A9OAMdXqPAz8GaclZIypqdxcuPvuiCUXgFGjRpGUlMTs2bN5/vnnufbaayu/gJcvX07//v255ZZb6NOnDz/60Y/45ptvarzt9PR0WrVqVeUYg6ry8ccfVz7et28fX331Fb/97W8577zzOO2008jPz6e8vLyyTsUxl+onBwRLS0ujdevWrFixokr58uXL6datW41jPp6OHTuSkJBQZT8+n4+VK1dW2U/jxo0ZNWoUTz31FO+88w4ffPAB//nPfwAnEeXm5nLffffxySef0Lp1a+bOnRu2GGsikl1knYFyVd0YVPY5MChE3ZzAuuB6lS0dETkD6AfcDFxxrJ2KyDicFhGZmZksWbKkLrFXKigoqPc2wiWWYoHYiudkjCU9PZ38/Px6bcPn89V7G3U1
atQo7rvvPg4cOMCVV15Jfn4+Pp+Pdu3aMWvWLN544w1OOeUU3njjDf75z3/SpEmTylhLSkrw+/1HffzLX/6SP/3pT2RnZ5OTk8Ozzz7Lzp07admyJfn5+cTHx9OsWTOefPJJmjZtys6dO/n9739PfHw8xcXF5Ofnk5GRQXJyMm+//TbNmzcnMTGR9PT0H+zr1ltv5aGHHqJt27b06tWLuXPnsmzZMpYtW0Z+fj4FBQWAcxyl+ntdVFR01Pe/rKyM8vJy8vPzSUpKYuzYsUycOJFGjRrRoUMHpk+fzq5du7j22mvJz8/niSeeIDMzkx49ehAfH8+sWbNIS0sjPT2dRYsWsWTJEoYMGULLli354osv2Lp1Kx07dgy5/+LiYnf+X+o613JtF2AgkFet7AZgSYi6PqBr0ONOgAICxAGrgTMD65YAv6hJDH379q3R/NTHcjLO9R4usRTPyRjLseZNr6lIz/Ue7NNPP1VAzzrrrCrxlJSU6PXXX69NmjTR9PR0vf766/UPf/iDtm/fvrLefffdpzk5OUd9XFZWpuPHj9f09HRNT0/XW265RX/5y1/qoEGDKussWrRIc3JyNDExUXNycvTdd9/VlJQUfeGFFypjefbZZzU7O1s9Hk/lc6vvy+fz6eTJk7Vt27bq9Xq1e/fuOm/evMr1mzZtUkA/+eSTKq8f0Ndff/2o7891112nF110UWUsxcXFevvtt2vLli01ISFB+/fvr8uWLaus/8wzz2jv3r01NTVVGzdurOecc46uWLFCVZ3PygUXXFD53FNPPVWnTJly1H0f67MFrNa6fu/X9Ym13hH0Bg5XK7sT+FuIugeBM4Ie9wXyA/dvBZ4PWmcJJkbEUjwnYywneoIJJZbiacixuJVgInmQfyMQLyKdgsp6Evog/brAulD1hgA/E5E8EckDzgKmicgTLsRsjDGmjiJ2DEZVC0XkTWCyiPwC6AVcgpMgqnsJuENE5uN0jd0J/CWwbgyQFFT3TeCvOKc/G2OMiRGRvtDyZuB5YDewD7hJVdeJyEDgH6pacYnrDOAU4MvA45mBMlT1QPAGRaQUOKSqByMQvzHGmBqKaIJR1f0417dUL18GpAY9VmBiYDneNgeHMURjjDFhYoNdGnMC0aqDWRhTb25+pizBGHOC8Hq9FBUVRTsMc5IpKysjPt6dzixLMMacIFq2bMn27ds5fPiwtWRMWPj9fnbt2kW6SwOe2mjKxpwg0tLSgCMj5dZFcXExSUlJx68YIbEUT0ONJSUlpXIk6XCzBGPMCSQtLa0y0dTFkiVL6N27dxgjqp9YisdiCT/rIjPGGOMKSzDGGGNcYQnGGGOMKyzBGGOMcYUlGGOMMa6wBGOMMcYVlmCMMca4whKMMcYYV1iCMcYY4wpLMMYYY1xhCcYYY4wrLMEYY4xxhSUYY4wxrrAEY4wxxhWWYIwxxrjCEowxxhhXWIIxxhjjCkswxhhjXGEJxhhjjCsswRhjjHGFJRhjjDGusARjjDHGFZZgjDHGuMISjDHGGFdYgjHGGOMKSzDGGGNcYQnGGGOMKyzBGGOMcUVEE4yIZIjIPBEpFJEtIjL6KPVERKaIyL7AMkVEJLCuuYisCJQfEJGVInJ2JF+HMcaY44uP8P6mA6VAJtALeEdEPlfVddXqjQMuBXoCCrwPbAKeBgqA64F/B9ZdAvxNRFqqanlEXoUxxpjjilgLRkRSgJHAPapaoKrLgbeBa0JUvw6YpqrbVHU7MA0YA6Cqxar6tar6AQF8QFMgIwIvwxhjTA2JqkZmRyK9gRWq2iiobAIwSFWHV6t7EBimqqsCj/sBi1W1cVCdL4CugBeYqao3HGW/43BaRGRmZvZ99dVX6/U6CgoKSE1Nrdc2wiWWYoHYisdiCS2WYoHYisdiCe3cc8/9VFX71enJqhqRBRgI5FUruwFYEqKuD+ga9LgTTneYVKuXBPwcuK4mMfTt21fra/HixfXeRrjEUiyqsRWPxRJaLMWiGlvx
WCyhAau1jt/7kTwGUwCkVStLA/JrUDcNKAi82EqqWgy8IiIbRGSNqn4ezoCNMcbUXSTPItsIxItIp6CynkD1A/wEynrWoF4FL3BKvSM0xhgTNhFLMKpaCLwJTBaRlMCpxZcAL4eo/hJwh4i0EZHWwJ3ALAAROVNEBohIgogki8gknLPSVkXkhRhjjKmRSJ+mfDPwPLAb2AfcpKrrRGQg8A9VrTiqNQOnRfJl4PHMQBlAIvDnwPqyQJ2LVHXH8Xbu94frZRhjjDmeiCYYVd2Pc31L9fJlQGrQYwUmBpbqdf9J1e6zGisrq8uzjDHG1EWDGirGEowxxkROg0ow5XadvzHGREyDSjClpdGOwBhjGo56JxgR8YYjkEiwFowxxkROrRKMiNwmIiODHj8HFInI1yLSJezRhZkdgzHGmMipbQvmNmAPgIicA1wBjAbW4AxIGdMswRhjTOTU9jTlNjjD5gMMB15X1ddE5EtgWVgjc4ElGGOMiZzatmAOAS0D94cCiwL3y3AGnoxplmCMMSZyatuCWQA8KyKfAT8C/hEoz+FIyyZmlZWBzwdxcdGOxBhjTn61bcH8ClgBtABGBa7MB+gDvBLOwNyyd2+0IzDGmIahVi0YVT0E3Bqi/L6wReSyvDzIzIx2FMYYc/Kr7WnK3YJPRxaRoSIyW0TuFpETouNp585oR2CMMQ1DbbvIngd6A4hINvAWkIHTdfZgeENzR15etCMwxpiGobYJpivwWeD+KGCVqv4UuAZn6uKYZy0YY4yJjNommDigYkSvIcD8wP1vcCb9imkej7VgjDEmUmqbYNYCNwUmCBsCvBsobwPE/PlZXq+1YIwxJlJqm2AmATcAS4BXVLVixskRwMdhjMsVlmCMMSZyanua8lIRaQGkqer3QatmAIfDGpkLvF7rIjPGmEip9ZTJquoTkSIR6Q4o8I2qbg57ZC6wFowxxkROba+DiReRR4Dvgc+BL4HvReRPJ8K8MF4vFBZCQUG0IzHGmJNfbY/B/Am4Gvgl0BnoBNyEc5ryw+ENLfy8gRRorRhjjHFfbbvIRgPXq+r8oLJvRGQPMBOYELbIXFCRYPLyoFOn6MZijDEnu9q2YNJxrnmp7hugSf3DcZe1YIwxJnJqm2A+x5nVsrrbA+tiWnALxhhjjLtq20U2EZgvIucBHwXKzgRaAxeGMzA3xMfbmWTGGBMptWrBqOpSnIP7fwVSA8vrwPmEbtnEnMxMSzDGGBMJdbkOZgfwu+AyEekJjAxXUG5q1cq6yIwxJhJqewzmhJeVZS0YY4yJhAaXYKwFY4wxkdEgE8yePVBeHu1IjDHm5FajYzAi8vZxqqSFIZaIyMoCVdi9G1q3jnY0xhhz8qrpQf59NVi/qZ6xRESrVs7tzp2WYIwxxk01SjCq+l9uBxIpWVnOrR2HMcYYdzXIYzBgZ5IZY4zbIppgRCRDROaJSKGIbBGR0UepJyIyRUT2BZYpIiKBdZ1F5C0R2SMi+0XkPRHpUtMYMjOdW0swxhjjrki3YKYDpUAmcBXwlIjkhKg3DrgU6An0AIYDNwbWNQHeBroEtvMx8FZNA0hMhIwM6yIzxhi3RSzBiEgKztX+96hqgaoux0kU14Sofh0wTVW3qep2YBowBkBVP1bV51R1v6qWAY8CXUSkWU1jsYstjTHGfaKqkdmRSG9ghao2CiqbAAxS1eHV6h4EhqnqqsDjfsBiVW0cYruXAk+paquj7HccTouIzMzMvq+++ip33tmTkhIPTzzxr1q/joKCAlJTU2v9PDfEUiwQW/FYLKHFUiwQW/FYLKGde+65n6pqvzo9WVUjsgADgbxqZTcAS0LU9QFdgx53ApRAQgwqbwtsB35ekxj69u2rqqpXX63aoYPWyeLFi+v2RBfEUiyqsRWPxRJaLMWiGlvxWCyhAau1jt/7kTwGU8APL8hMA/JrUDcNKAi8WABEpAWwAHhSVV+pTSBZWc4xmAg13owxpkGKZILZCMSLSPBkxT2B
dSHqrgusC1lPRJriJJe3VfW/axtIq1ZQXAwHD9b2mcYYY2oqYglGVQuBN4HJIpIiImcDlwAvh6j+EnCHiLQRkdbAncAsABFJA97DOZ7zm7rEUnGxpR3oN8YY90T6NOWbgWRgN/AKcJOqrhORgSJSEFRvBvA34EtgLfBOoAzgZ8CPgf8SkYKgpV1Ng6i42NJOVTbGGPfUesKx+lDV/TjXt1QvX4YzO2bFY8WZnnliiLovAi/WJw5rwRhjjPsa3FAxYC0YY4yJhAaZYNLTISnJWjDGGOOmBplgRI6cqmyMMcYdDTLBgNNNZi0YY4xxT4NNMNaCMcYYdzXYBGMtGGOMcVeDTTBZWbB/P5SURDsSY4w5OTXYBFNxqvKuXdGNwxhjTlYNNsHYxZbGGOOuBptg7GJLY4xxV4NPMNaCMcYYdzTYBNOypXPBpbVgjDHGHQ02wcTHQ4sW1oIxxhi3NNgEA3axpTHGuKlBJxi72NIYY9zToBNMVpYlGGOMcUuDTjCtWjkXWvr90Y7EGGNOPg0+wZSVOUPGGGOMCa8GnWAqrua3A/3GGBN+DTrB2MWWxhjjngadYKwFY4wx7mnQCcZaMMYY454GnWBSUyElxVowxhjjhgadYMAutjTGGLdYgrEEY4wxrmjwCcbGIzPGGHc0+ARjLRhjjHFHg08wWVlw6BAcPhztSIwx5uTS4BOMTZ1sjDHuaPAJxi62NMYYdzT4BGMXWxpjjDsafIKpaMFYgjHGmPBq8AmmRQuIi7MuMmOMCbeIJhgRyRCReSJSKCJbRGT0UeqJiEwRkX2BZYqISND6Z0TkaxHxi8iY+sTk8UBmprVgjDEm3CLdgpkOlAKZwFXAUyKSE6LeOOBSoCfQAxgO3Bi0/nPgZuCzcARlF1saY0z4RSzBiEgKMBK4R1ULVHU58DZwTYjq1wHTVHWbqm4HpgFjKlaq6nRVXQQUhyM2u9jSGGPCT1Q1MjsS6Q2sUNVGQWUTgEGqOrxa3YPAMFVdFXjcD1isqo2r1VsOzFTVWcfY7zicFhGZmZl9X3311R/UeeSRLqxalcFf/7ryuK+joKCA1NTU49aLhFiKBWIrHosltFiKBWIrHosltHPPPfdTVe1XpyerakQWYCCQV63sBmBJiLo+oGvQ406AEkiIQeXLgWqAcBwAABjnSURBVDE1jaFv374ayu9/r+rxqJaXh1xdxeLFi49fKUJiKRbV2IrHYgktlmJRja14LJbQgNVax+/9SB6DKQDSqpWlAfk1qJsGFARebNhlZYHfD3v3urF1Y4xpmCKZYDYC8SLSKaisJ7AuRN11gXXHq1c7eXmw8ofdYHaxpTHGhF/EEoyqFgJvApNFJEVEzgYuAV4OUf0l4A4RaSMirYE7gVkVK0UkQUSSAAG8IpIkIsd/Ldu3w5AhP0gylmCMMSb8In2a8s1AMrAbeAW4SVXXichAESkIqjcD+BvwJbAWeCdQVmEBUAScBTwTuH9OjSIoKoKnn65SZOORGWNM+MVHcmequh/n+pbq5cuA1KDHCkwMLKG2M7jOQYjASy9BQQE8/ji0bWvDxRhjjAsa1lAxbdrAkiXw0EMwfz6cdho89hjJ3nLS060FY4wx4dSwEkxWFpxzDtx9N6xbBwMGwK9/DWecwdAmn1gLxhhjwqhhJZhgp5zitGJeew3y8pi7pT9XrrgVDh6MdmTGGHNSaLgJBpzjMZdfDhs28H6nX/GzndOdbrPXXoMIjXBgjDEnq4adYCqkp/NCn79wVtzHFKS1gv/3/+CnP4Vvv412ZMYYc8KyBINzWcybb8IqXz9abVnFptsfg+XLISfHOSGgtDTaIRpjzAnHEgzOiWU+n3O/oDieX3x5O7v++RVcdBH87nfQuzcsWxbVGI0x5kRjCQYYPBgSE53Jxzwe+OADyD6zDVcl/pWvp/0dCguds88uvpgOzz0XcrgZY4wxVVmCAXJzYdEiePBBp2ds40a4+Wb4+9+h650XMaj5Or496yr0nXdo
P3u2k2wWLIh22MYYE9MswQTk5jqXx+TmQqdO8NhjsG0bPPEE7CpI4dkPc/DjQQDKy2HECLj3Xti/P9qhG2NMTLIEcwyNG8OvfgXr18Oljw6m3JNIGXEUk8j69Fx44AFo3x5++1sb698YY6qxBFMDHg/0H59L4vJFrLtiHE9fsZjc4sV050sWJFyE/vGPaIcOMHEi7NoV7XCNMSYmWIKpjdxcDtx0BePn5rJtG9z0RHdua/Eq3XQdb/guxT91Gr52Hfl4wB2s/puNO2OMadgswdRRcPfZ4++dxqwhs+mqG3i59Ar6rPgz3Ud05OPc29i0bBt+f7SjNcaYyLMEU08eDwwb5pxxNuLOzlwvs+jC18zmanp/9BStzzmV5xJv5rJ+3/GrX8HMmbB6NRQXRztyY4xxV0TngznZjRwJTz4JW0pP5baEmTSZ/Hu6//2P/NeymYz5bCazvxjD5LK7ySKPn8gSNncYjOfsXHr1onJp1szZ1sqVzgWggwc7Z7YZY8yJxhJMGFVcT3MkMXSACU/D1t/BlCmMefZZxnieQxHwK+VbvFx/cB4TZl+AM/szZGdDu3awapUzuoDX65wyPXgwtGgBGRlOq8kYY2KdJZgwy80N0eLIzoYnnkDuvhuuuAL58EMAEvwlzN7/U15OTuZwRlv2JmezxZfNJ/9qx2nl2Wwlm62l2Uy6OZt80gAnuTRrBuelrKR/8Qcs6JrI911zadECmjd3klDFsmULfPEFDBlyjFZQLDWVYikWY0y9WYKJpDZtYOpU5xu/tBTi4uCmm5C4OFK2biVl61bab13IwOKdCFXPDChNTiM/PZt9KdkcLk+g+5b5eNRHed4U/vLxRL4s6sQmdVpBStXb9fcIc5pCRjOhcZrQuDE0ThPal2wk94MH8fjKUa+XPX+ZS8qoC0lpmoDIkX2H43s/5Db8fti3z5mretEimDTJuYg1IQFefx0uvpgqgRhjTiiWYCLth/1oP6gi5eV8+rcdrHtvK2e23krn5K0kbN1Ks8DCV1+BlgOQQCl3Hn7w+Pv9PrAchZSWkHnjpXAj7CODffFZHEjO4vuELDbsz+SAZvGMJ4uFZ2WR2C6TwsZZlDRujscbR1wctN+xkiafv8OL/RLJa9+f1JJ9pBfuoHHBToq/3cG6hTvJ8u9gt+xga+udZBTvIOlAHnG+sh8GU1ICI0ZQltqUwg45FHboRlHHHIpOyWFNWQ4fbcok9yyhf39nDLmkpCOL11uznGSNJWPcZwkmGkL2owWJj6fvz9rR92ftQq9fuRKGDEFLSpCEBHj5ZWfEZ6icKO1fnynXXgvlZUqCV5k1C3r30so6qlC08l8k/WosUlaG3xPH2qG/5lB5CnF78kjYn0dqfh4tD6xigO4khcPgB5YfCcOHh920JJ9UTuVbPPjRT6fhx0M8vh+EvZ+m7NDWfLW9FTs4l520Yget2UkrmrKfxxlPPGX4iGc6N5NScJictevIWfs6bXgGgNOBn5LB+qe6sYgc1gUtu2kJCIMSVjKY5TySksi6tNzK5FORjIqLnTP5/H6nEXnBBdC2rbM+IeHIbfD9UGXffOPMvN2/P5x5JjRq5CzJyc5tXFzNPg7hSnautTRjOZYaVLIfE9FjCeZEFGgFbXr+eU65/vqQ/zW9fwTPZB/5x+pdrYoAjbqdBt06wpIlxA0eTM8Q21m5EroNAW9JAdnePF59fBfdm+dBXh5xeXm02rWLVsuWoV/5nQ45ATl7AMUjRlLeohXlLVvz2c5WXParVhSUJ+H1Ovnw7F7OSQx+/5Hb7z7rQerqJRzsPZizc3Iry78oV+L37WLlzHVsW7CObuqklKu9c0ktO1AZ6+HkDA6mZdNy91pE/fjLJ/N+1q18l3Iahf5kCkuSKDiczJZdSXj8yRSTRJEvme+WJrE5MZlDZckcKk2ioMSLzy+cyUoGs4QlDOYjjv7N9MwzocsTEmBQwkoG+pcxvWkiX2fk/iAJFRY646ZWJLvLLoPWrZ1W
WG2W7dud99Xnc7Zz6aXOMbnS0iNLx7yVtPt2EfNaJfJ5o9wq60pL4dAhyMtzfqOIOEm3SZMjSbUisfYoXEnPA0v4d+vBfNcmt8q6xERnGy++6PR2xsXB6NHOMcHSUqdxWlLi3G+/YyXtNi1kXssE1if1QYtL0OISpKSYkkMlHNhVQgIl/J1imqWUkOwpIUFLSNRivFpC5/L1/Kr0UeIox4eXe73/zUrPAPZrBnu1Gd9rE8o1rnIqDoBWrZz3pXHjI0tqqnO7f39HVq4Mva5xY2cQ3FWrnB8Tffs621Ot2W3F/X/9y9nGWWc5/7Ze75ElIcF5vyLZAnc7+Yo2oKmB+/Xrp6tXr67XNpYsWcLgwYPDE1A9RSqW434IAy0qf0kJnsREpwuwWsVw/aKtOHyVkACLFiq5HfOcZkTFsmCBc3ZDfXg8qNfrfBMCIPhaZKLJKfjj4vF74tl3IJ68ffH4iKOceDJaxJPePJ5y4inXeMqIJ77wIB23LUfUh1/i+CRrBDsS2lPkS6TQl8jh8kT25idyoDiREpzFH5+ILz6RYpIoIZFidco7+TbQ17+aT6Ufa+V0UEXUf2TBj6B48OPBT1KCn8aN/CTE+/HGK6f51zJhz2+I0zJ8Es8LHf7A7sY/IiHOR2JcOV6Pj907fezcWo4HH/H4aNvKR2azcvzlPrTMB+XlNC/czMV7X8SDDz9xvJt0CfvJIN5XSpyvlHh/CQmUVi6JOI+TpJREKSGRUhKklEb+AhrrocC76w4/QkF8E3aXZ7CPZuwnA3/TZmgT5/EeXwa7yzPYWdqMJoXbOaVoLUsZyAoGcJhGFJGMnxo2QwNq+qPkWAbErWSwfMCqpJ+wJjm3MvlUJKLSUqf1rOqc9NO1q5MAK1R8pXfPX0nf/CV82ngwaxvn/mAW+IICp7ddxPlhEOLfFlaupO1ZZ23fptq2Lq/FEkwtNcQEUyMrV/LtMVpUYdxN7ZLdG29A9+5Ov1hRUeXt+s+KWf9ZET1+VETndlXXUVzs7GTFiiM/53v3htNOc36Wl5ezf3c5q1aU4/GX4/WU07dnOekpvsr1lJc7Jy8Ej01X0W9W8RP+ZJCcDOnpVfoOC0oTWPefREo0gTJPAn3OTKRpZkLVPsYvv4SPPz7y/g4dCuefX9mP+e/vErn/j4kUlifii0/ij48mktMn8UhfZ2IirF3rTG9eVuZ88/7lL86JNPv2OaOc79vHzvX7WTpvH018+2km++neah9Jh/fDgQPHf22AP96LL7ERBb5k9hc7SaeIZBo1b0R6VjLlCY3wJSTj8ybjLTpI+8/+D/H7UE8c23tcSGmjJsT5SvGUl7J3ewn780rxBhJvi/RSmjQqxVNWiqe8BE95Kd7SQhJL8yv3fzghnbL4RpSLl3LxUiYJFJZ6KSjxUkoCZXjxJnvxpngplwTKPV584iWl/AB99r2PqB8VDx83/ykHEjOrJPNDh+Cgk+MRgX59nWvxKu3aBfPn08/nY7VqnX4HWBeZCY/cXL4rKeEUlzu5j3f4qqL7cPNxkl23QdDtWDuq3lx64okq28oAmgQlu/SaJLuFC49sQ7Wyz+iT5SWsWlrCWX1L6JNT4iS4ir6kkhKYM8dZ/H7nJ+vPfw6jRh2ZIc/jARHWf+Xh8y899OjtIae7U1a5fv16uO02tKwM8Xphxgzo0wfi452kF1g+/TyeFR/FkTsgjh/nVl1HfLyTFIYODWpG/vBnbyrgXwkrAu9N05q8N/ffX2U7nYBbLjjy/uaE2sYppzizAx7jF0croF3Q3ympokp5uZNk9u+HRx91+jkr3t+LLnLmfDp8GE9REZ6iIko3HWbVO0Uk+Q6T4imic7si0uO/h6IdcPCw88Nk/37wOyffiL+c7G+WONcUBBJvRpMEvtydSIk/gXxPOp1PTySjeuL94gunDy2QeFN6d4XTT3eSaGDZn1fK5hVlxPnLSPSU0v1HRaQlHXLWl5Y6t3v2gAb6BtVHbvES8Dau8t6UJsBe
QAFRaL4Z2BFUIT+fKv2LdaGqDWbp27ev1tfixYvrvY1wiaVYVGMrnrDE8uGHqg895NzWYxvf/OIX9d6GJierxsU5t3XdVjhiqYinvu9LOOOpr8D76/N4jvn+Hvdl1+DvVNNtRCKW424nsI0+4Nc6fudG/Us/koslGHfFUjwnXSxh+lKPpfdFNYbiiaXkG2OxtIFtWsfvXOsiM+ZEcNy+QVMv4eriDcffKcZi2Q55dX26jWpljDHGFZZgjDHGuMISjDHGGFdYgjHGGOMKSzDGGGNcYQnGGGOMKyKaYEQkQ0TmiUihiGwRkdFHqSciMkVE9gWWKSJHhoATkV4i8qmIHA7c9gq1HWOMMdET6RbMdKAUyASuAp4SkZwQ9cYBlwI9gR7AcOBGABFJAN4CZgNNgReBtwLlxhhjYkTEEoyIpAAjgXtUtUBVlwNvA9eEqH4dME1Vt6nqdmAaMCawbjDOGGqPqWqJqv4ZZ0DWn7j8EowxxtRCJK/k7wyUq+rGoLLPgUEh6uYE1gXXywla94WqBg8D/UWg/N3qGxKRcTgtIoACEfm6buFXao4zRlwsiKVYILbisVhCi6VYILbisVhC61LXJ0YywaQCh6qVHQQaH6XuwWr1UgPHYaqvO9Z2UNVngKNMCVV7IrJaVfuFa3v1EUuxQGzFY7GEFkuxQGzFY7GEJiJ1nuMkksdgCoC0amVpQH4N6qYBBYFWS222Y4wxJkoimWA2AvEi0imorCewLkTddYF1oeqtA3oEn1WGcyJAqO0YY4yJkoglGFUtBN4EJotIioicDVwCvByi+kvAHSLSRkRaA3cCswLrlgA+4DYRSRSRWwLlH7gZf5CwdbeFQSzFArEVj8USWizFArEVj8USWp1jieiUySKSATwPDAX2Ab9R1f8VkYHAP1Q1NVBPgCnALwJPnQlMqjiwLyK9A2XdgA3AWFX9V8ReiDHGmOOKaIIxxhjTcNhQMcYYY1xhCcYYY4wrLMHUQOBkgucC46fli8gaEbkwBuLqJCLFIjI7BmK5UkQ2BMaZ+yZwXC0acXQQkfki8r2I5InIEyISkeu9ROQWEVktIiUiMqvauiEi8lVg/LzFItI+GrGIyJki8r6I7BeRPSLyuoi0ikYs1ercKyIqIue5Gcvx4hGRRiLypIjsFZGDIrI0irFcEfifyheR9SJyqcuxHPN7ri6fYUswNRMPbMUZdSAd+D3wmoh0iGJM4Izt9kmUY0BEhuKclPFfOBe8ngN8G6VwngR2A62AXjh/s5sjtO8dwIM4J7JUEpHmOGdQ3gNkAKuBudGIBWf8vmeADkB7nOvHXohSLACIyKnA5cBOl+OoSTzP4PyNTgvc/joasYhIG5zxFu/Auc7vLuB/RaSli7Ec9Xuurp/hSF7Jf8IKnGJ9f1DR30VkE9AX2ByNmETkSuAA8CHwo2jEEOQPwGRV/SjweHsUY+kIPKGqxUCeiLzLkWGGXKWqbwKISD+gbdCqy4B1qvp6YP39wF4R6aqqX0UyFlX9R3A9EXkC+KcbMRwvliDTgUk4Pw5cd7R4RKQrMAJoq6oVo458Go1YAvcPBP293hGRQuBUnB9QbsRyrO+5ZtThM2wtmDoQkUycsdWicnGniKQBk3F+3USViMQB/YAWIvIfEdkW6JZKjlJIjwFXBro62gAXEmKMugirMrZe4B/5GyKU+I7jHKJ4kbKIXA6UqOr8aMUQ5AxgC/CHQBfZlyIyMkqxrAY2iMgIEYkLdI+V4Iy7GBHVvufq9Bm2BFNLIuIF5gAvuvXrswYeAJ5T1W1R2n+wTMALjAIG4nRL9cZpXkfDUpwP/SFgG84/6v9FKZYKtRo/L1JEpAdwL073SzT23xh4CLg9GvsPoS3QHedv0xq4BXhRRE6LdCCq6sO54Px/cRLL/wI3Br7YXRfie65On2FLMLUgIh6ckQdKcT580YihF3Ae8Gg09h9CUeD2L6q6U1X3Av8D/DTSgQT+Pu/i9BWn4IxI2xTn+FA0xdz4
eSLyI+AfwO2quixKYdwPvKyqm6O0/+qKgDLgQVUtVdV/AouBYZEOJHCyw59wpidJwDkuMlMiMLniUb7n6vQZtgRTQyIiwHM4v9hHqmpZlEIZjHOA9jsRyQMmACNF5LNoBKOq3+O0FIKv2I3W1bsZQDucYzAlqroP5wB2xJNdNVXG1hNnbqRTiV4Xa3tgIfCAqoYaqilShuAM+ZQX+Cxn4xxUnhSleEJ1P0Xrs9wLWKqqq1XVr6qfAKtwfly65hjfc3X6DFuCqbmncM4sGa6qRcer7KJncP6wvQLL08A7wPlRjOkF4FYRaSkiTXHOvPl7pIMItJ42ATeJSLyINMGZvC4i/daBfSYBcUCciCSJc4r0PKC7iIwMrL8XZ04j17pYjxZL4LjUBzhJ+Gm39l+TWHASTHeOfJZ34MxcOz1K8SwFvgPuDtQ5GzgXeC8KsXwCDKxosYgzPNZA3P8sH+17rm6fYVW15TgLzumcChTjNBUrlqtiILb7gdlRjsGLcwbQASAP+DOQFKVYeuEMiPo9zoRNrwGZEfxbaLXl/sC684CvcLphlgAdohELcF/gfvDnuCBa70u1epuB86L8d8oBVgKFwHrgZ1GM5RbgPzjdUN8Cd7ocyzG/5+ryGbaxyIwxxrjCusiMMca4whKMMcYYV1iCMcYY4wpLMMYYY1xhCcYYY4wrLMEYY4xxhSUYY04Q4syXMiracRhTU5ZgjKkBEZkV+IKvvnx0/Gcb0zDZfDDG1NxC4JpqZaXRCMSYE4G1YIypuRJVzau27IfK7qtbROSdwJSyW0Tk6uAni8jpIrJQRIrEmbJ4loikV6tzXWAekhIR2SUiL1aLIUOcaY4LReTbEPu4N7DvksAgki+58k4YUwOWYIwJnz8Ab+OMh/YM8FJgpsKK0Wffwxnb6QzgZ8BZBE2VKyI3AjNwBg/tgTMK9Npq+7gXeAtnZNu5wPMi0i7w/JE4o2vfDHQCLgY+duF1GlMjNhaZMTUgIrOAq3EGAgw2XVUniYgCM1X1hqDnLATyVPVqEbkBmIozHW9+YP1gnPlGOqnqf0RkG87Apb85SgwK/FFV7w48jseZWG2cqs4WkTtwRiPurtGbTsKYSnYMxpiaWwqMq1Z2IOj+ymrrVgIXBe6fhjO8efAETR8CfqCbiBwC2gCLjhND5XDtqlouInuAloGi13Fmh9wkIu/hTL72tqqWHGebxrjCusiMqbnDqvqfasveMGy3Nt0I1VsmSuD/WFW3Al1wWjGHgGnAp4HuOWMizhKMMeFzZojHGwL3NwCnB+ahr3AWzv/gBlXdDWzHmYSrzlS1WFXfUdVfAz/Gmd/k7Pps05i6si4yY2ouUUSyqpX5VHVP4P5lIvIJzmRMo3CSRf/Aujk4JwG8JCL3Ak1xDui/qar/CdT5b+BREdmFM0tpI2CIqk6rSXAiMgbnf3oVzskE/w+nxfPvWr5OY8LCEowxNXcesLNa2XagbeD+/cBInBk99wD/pc5c6qjqYRE5H3gM58yuYpyzwW6v2JCqPiUipcCdwBRgPzC/FvEdACbhnEzgxZmR8TJV3VSLbRgTNnYWmTFhEDjD63JV/Wu0YzEmVtgxGGOMMa6wBGOMMcYV1kVmjDHGFdaCMcYY4wpLMMYYY1xhCcYYY4wrLMEYY4xxhSUYY4wxrvj/LfV5tIs8PbYAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_learning_curves(history.history[\"loss\"], history.history[\"val_loss\"])\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "id": "HAS8HZnX5jk9", "outputId": "f53af0f5-72cc-43d4-d48c-d6214f166d76", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "y_pred = model.predict(X_valid)\n", "plot_series(X_valid[0, :, 0], y_valid[0, 0], y_pred[0, 0])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "7Mk8hrHT5jk9" }, "source": [ "Since a SimpleRNN layer uses the tanh activation function by default, the predicted values must lie within the range –1 to 1. It might be preferable to\n", "replace the output layer with a Dense layer: it would run slightly faster, the accuracy would be roughly the same, and it would allow us to choose any output activation function we want. If you make this change, also make sure to remove `return_sequences=True` from the second (now last) recurrent layer" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "id": "nT3gQCrq5jk9", "outputId": "2aa5a397-0474-4438-9733-360a6ecc9278", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_3\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " simple_rnn_4 (SimpleRNN) (None, None, 20) 440 \n", " \n", " simple_rnn_5 (SimpleRNN) (None, 20) 820 \n", " \n", " dense_1 (Dense) (None, 1) 21 \n", " \n", "=================================================================\n", "Total params: 1,281\n", "Trainable params: 1,281\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),\n", " keras.layers.SimpleRNN(20),\n", " keras.layers.Dense(1)\n", "])\n", "\n", "model.summary()" ] }, { "cell_type": "markdown", "metadata": { "id": "IXZoG7QA5jk-" }, "source": [ "## Forecasting Several Steps Ahead" ] }, { 
"cell_type": "markdown", "source": [ "So far we have only predicted the value at the next time step, but we could just as easily have predicted the value several steps ahead by changing the targets appropriately (e.g., to predict 10 steps ahead, just change the targets\n", "to be the value 10 steps ahead instead of 1 step ahead). But what if we want to predict the next 10 values?\n", "\n", "The first option is to use the model we already trained, make it predict the next value, then add that value to the inputs (acting as if this predicted value had actually occurred), and use the model again to predict the following\n", "value, and so on, as in the following code:" ], "metadata": { "id": "cg0V6AMCJ8Zg" } }, { "cell_type": "code", "execution_count": 25, "metadata": { "id": "aEXPkSXD5jk-" }, "outputs": [], "source": [ "np.random.seed(43) # not 42, as it would give the first series in the train set\n", "\n", "series = generate_time_series(1, n_steps + 10)\n", "X_new, Y_new = series[:, :n_steps], series[:, n_steps:]\n", "X = X_new\n", "for step_ahead in range(10):\n", " y_pred_one = model.predict(X[:, step_ahead:])[:, np.newaxis, :]\n", " X = np.concatenate([X, y_pred_one], axis=1)\n", "\n", "Y_pred = X[:, n_steps:]" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "id": "2QT45Nd85jk-", "outputId": "4b717995-5a87-43c0-8814-4c06a28b59a3", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(1, 10, 1)" ] }, "metadata": {}, "execution_count": 26 } ], "source": [ "Y_pred.shape" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "id": "sJ_ocxoY5jk-", "outputId": "c614333d-4fdd-41fe-97ea-a9d5fe9577a4", "colab": { "base_uri": "https://localhost:8080/", "height": 314 } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Saving figure forecast_ahead_plot\n" ] }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "def plot_multiple_forecasts(X, Y, Y_pred):\n", " n_steps = X.shape[1]\n", " ahead = Y.shape[1]\n", " plot_series(X[0, :, 0])\n", " plt.plot(np.arange(n_steps, n_steps + ahead), Y[0, :, 0], \"bo-\", label=\"Actual\")\n", " plt.plot(np.arange(n_steps, n_steps + ahead), Y_pred[0, :, 0], \"rx-\", label=\"Forecast\", markersize=10)\n", " plt.axis([0, n_steps + ahead, -1, 1])\n", " plt.legend(fontsize=14)\n", "\n", "plot_multiple_forecasts(X_new, Y_new, Y_pred)\n", "save_fig(\"forecast_ahead_plot\")\n", "plt.show()" ] }, { "cell_type": "markdown", "source": [ "As you might expect, the prediction for the next step will usually be more accurate than the predictions for later time steps, since the errors might accumulate. If you only want to forecast a few time steps ahead, on more\n", "complex tasks, this approach may work well." ], "metadata": { "id": "SkRQp2eMLsDS" } }, { "cell_type": "code", "execution_count": 28, "metadata": { "id": "r3mvSba_5jk-" }, "outputs": [], "source": [ "np.random.seed(42)\n", "\n", "n_steps = 50\n", "series = generate_time_series(10000, n_steps + 10)\n", "X_train, Y_train = series[:7000, :n_steps], series[:7000, -10:, 0]\n", "X_valid, Y_valid = series[7000:9000, :n_steps], series[7000:9000, -10:, 0]\n", "X_test, Y_test = series[9000:, :n_steps], series[9000:, -10:, 0]" ] }, { "cell_type": "markdown", "metadata": { "id": "HDbQm3hU5jk-" }, "source": [ "Now let's predict the next 10 values one by one:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "id": "Nb9XFiww5jk_" }, "outputs": [], "source": [ "X = X_valid\n", "for step_ahead in range(10):\n", " y_pred_one = model.predict(X)[:, np.newaxis, :]\n", " X = np.concatenate([X, y_pred_one], axis=1)\n", "\n", "Y_pred = X[:, n_steps:, 0]" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "id": "ioAnpOO75jk_", "outputId": "e2c25792-d471-4410-e9c7-639ea3b9bc71", "colab": { "base_uri": 
"https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(2000, 10)" ] }, "metadata": {}, "execution_count": 30 } ], "source": [ "Y_pred.shape" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "id": "JSeX3I_y5jk_", "outputId": "9c0fe992-efbc-4dd4-cf9b-c6e34d711664", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.19622587" ] }, "metadata": {}, "execution_count": 31 } ], "source": [ "np.mean(keras.metrics.mean_squared_error(Y_valid, Y_pred))" ] }, { "cell_type": "markdown", "metadata": { "id": "mLeMjfsH5jk_" }, "source": [ "Let's compare this performance with some baselines: naive predictions:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "id": "oYYPT19r5jk_", "outputId": "3c3443e1-ff3a-4da8-da44-54a5c20273c1", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.25697407" ] }, "metadata": {}, "execution_count": 32 } ], "source": [ "Y_naive_pred = np.tile(X_valid[:, -1], 10) # take the last time step value, and repeat it 10 times\n", "np.mean(keras.metrics.mean_squared_error(Y_valid, Y_naive_pred))" ] }, { "cell_type": "markdown", "metadata": { "id": "Hb7kFoGB5jk_" }, "source": [ "The second option is to train an RNN to predict all 10 next values at once. We can still use a sequence-to-vector model, but it will output 10 values instead of 1. 
Now we just need the output layer to have 10 units instead of 1:" ] }, { "cell_type": "code", "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),\n", " keras.layers.SimpleRNN(20),\n", " keras.layers.Dense(10)\n", "])\n", "\n", "model.summary()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "3JAX6ywcMpTZ", "outputId": "094bf7d4-fef8-4c70-faec-59eadcab1aba" }, "execution_count": 34, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_4\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " simple_rnn_6 (SimpleRNN) (None, None, 20) 440 \n", " \n", " simple_rnn_7 (SimpleRNN) (None, 20) 820 \n", " \n", " dense_2 (Dense) (None, 10) 210 \n", " \n", "=================================================================\n", "Total params: 1,470\n", "Trainable params: 1,470\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "id": "v6AHQGbF5jk_", "outputId": "8bacd10d-4c87-41c4-a0b1-42ac4eb4c5fa", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 34s 149ms/step - loss: 0.0669 - val_loss: 0.0317\n", "Epoch 2/20\n", "219/219 [==============================] - 36s 164ms/step - loss: 0.0265 - val_loss: 0.0200\n", "Epoch 3/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0183 - val_loss: 0.0160\n", "Epoch 4/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0155 - val_loss: 0.0144\n", "Epoch 5/20\n", "219/219 [==============================] - 25s 
114ms/step - loss: 0.0139 - val_loss: 0.0118\n", "Epoch 6/20\n", "219/219 [==============================] - 36s 167ms/step - loss: 0.0128 - val_loss: 0.0112\n", "Epoch 7/20\n", "219/219 [==============================] - 25s 115ms/step - loss: 0.0122 - val_loss: 0.0110\n", "Epoch 8/20\n", "219/219 [==============================] - 25s 114ms/step - loss: 0.0115 - val_loss: 0.0103\n", "Epoch 9/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0111 - val_loss: 0.0112\n", "Epoch 10/20\n", "219/219 [==============================] - 24s 111ms/step - loss: 0.0110 - val_loss: 0.0100\n", "Epoch 11/20\n", "219/219 [==============================] - 25s 116ms/step - loss: 0.0108 - val_loss: 0.0103\n", "Epoch 12/20\n", "219/219 [==============================] - 26s 120ms/step - loss: 0.0102 - val_loss: 0.0096\n", "Epoch 13/20\n", "219/219 [==============================] - 43s 195ms/step - loss: 0.0104 - val_loss: 0.0100\n", "Epoch 14/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0098 - val_loss: 0.0103\n", "Epoch 15/20\n", "219/219 [==============================] - 35s 161ms/step - loss: 0.0095 - val_loss: 0.0107\n", "Epoch 16/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0092 - val_loss: 0.0089\n", "Epoch 17/20\n", "219/219 [==============================] - 25s 113ms/step - loss: 0.0094 - val_loss: 0.0111\n", "Epoch 18/20\n", "219/219 [==============================] - 25s 113ms/step - loss: 0.0095 - val_loss: 0.0094\n", "Epoch 19/20\n", "219/219 [==============================] - 25s 116ms/step - loss: 0.0093 - val_loss: 0.0083\n", "Epoch 20/20\n", "219/219 [==============================] - 25s 113ms/step - loss: 0.0094 - val_loss: 0.0085\n" ] } ], "source": [ "model.compile(loss=\"mse\", optimizer=\"adam\")\n", "history = model.fit(X_train, Y_train, epochs=20,\n", " validation_data=(X_valid, Y_valid))" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "id": 
"CTOKPakr5jlA" }, "outputs": [], "source": [ "np.random.seed(43)\n", "\n", "series = generate_time_series(1, 50 + 10)\n", "X_new, Y_new = series[:, :50, :], series[:, -10:, :]\n", "Y_pred = model.predict(X_new)[..., np.newaxis]" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "id": "J2sKchZu5jlA", "outputId": "8ac7905d-f1cd-4f9b-e2e4-742405e7668a", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_multiple_forecasts(X_new, Y_new, Y_pred)\n", "plt.show()" ] }, { "cell_type": "markdown", "source": [ "This model works nicely: the MSE for the next 10 time steps is about 0.008. That’s much better than predicted one by one. But we can still do better: indeed, instead of training the model to forecast the next 10 values only at the\n", "very last time step, **we can train it to forecast the next 10 values at each and every time step.** In other words, we can turn this sequence-to-vector RNN into a sequence-to-sequence RNN. The advantage of this technique is that the loss will contain a term for the output of the RNN at each and every time step, not just the output at the last time step. This means there will be many more error gradients flowing through the model, and they won’t have to flow only through time; they will also flow from the output of each time step. This will both stabilize and speed up training.\n", "\n", "Now let's create an RNN that predicts the next 10 steps at each time step. That is, instead of just forecasting time steps 50 to 59 based on time steps 0 to 49, it will forecast time steps 1 to 10 at time step 0, then time steps 2 to 11 at time step 1, and so on, and finally it will forecast time steps 50 to 59 at the last time step. So each target must be a sequence of the same\n", "length as the input sequence, containing a 10-dimensional vector at each step. 
Let’s prepare these target sequences:" ], "metadata": { "id": "2lxUhMIvNId7" } }, { "cell_type": "code", "execution_count": 37, "metadata": { "id": "sRRkwZGd5jlA" }, "outputs": [], "source": [ "np.random.seed(42)\n", "\n", "# Notice that the model is still causal: when it makes predictions at\n", "# any time step, it can only see past time steps.\n", "\n", "n_steps = 50\n", "series = generate_time_series(10000, n_steps + 10)\n", "X_train = series[:7000, :n_steps]\n", "X_valid = series[7000:9000, :n_steps]\n", "X_test = series[9000:, :n_steps]\n", "Y = np.empty((10000, n_steps, 10)) # each target is a sequence of 10D vectors\n", "for step_ahead in range(1, 10 + 1):\n", "    Y[:, :, step_ahead - 1] = series[:, step_ahead:step_ahead + n_steps, 0]\n", "Y_train = Y[:7000]\n", "Y_valid = Y[7000:9000]\n", "Y_test = Y[9000:]" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "id": "G0w2fo8M5jlA", "outputId": "95d507e9-1197-4ccb-d732-ae536a6f508d", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "((7000, 50, 1), (7000, 50, 10), (10000, 60, 1))" ] }, "metadata": {}, "execution_count": 41 } ], "source": [ "X_train.shape, Y_train.shape, series.shape" ] }, { "cell_type": "markdown", "source": [ "To turn the model into a sequence-to-sequence model, we must set `return_sequences=True` in all recurrent layers (even the last one), and we must **apply the output Dense layer at every time step**. Keras offers a\n", "`TimeDistributed` layer for this very purpose: it wraps any layer (e.g., a Dense layer) and applies it at every time step of its input sequence. It does this efficiently, by reshaping the inputs so that each time step is treated as a\n", "separate instance (i.e., it reshapes the inputs from `[batch size, time steps, input dimensions]` to `[batch size × time steps, input dimensions]`; 
in this example, the number of input dimensions is 20 because the previous SimpleRNN\n", "layer has 20 units), then it runs the Dense layer, and finally it reshapes the outputs back to sequences (i.e., it reshapes the outputs from `[batch size × time steps, output dimensions]` to `[batch size, time steps, output dimensions]`; in this example the number of output dimensions is 10, since the Dense layer has 10 units). Here is the updated model:" ], "metadata": { "id": "QSdLOYW3PFgj" } }, { "cell_type": "code", "execution_count": 39, "metadata": { "id": "lkFg1r3-5jlA", "outputId": "425807e6-1b3c-45b2-e8a9-09f18f961983", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_5\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " simple_rnn_8 (SimpleRNN) (None, None, 20) 440 \n", " \n", " simple_rnn_9 (SimpleRNN) (None, None, 20) 820 \n", " \n", " time_distributed (TimeDistr (None, None, 10) 210 \n", " ibuted) \n", " \n", "=================================================================\n", "Total params: 1,470\n", "Trainable params: 1,470\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", "    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),\n", "    keras.layers.SimpleRNN(20, return_sequences=True),\n", "    keras.layers.TimeDistributed(keras.layers.Dense(10))\n", "])\n", "\n", "model.summary()" ] }, { "cell_type": "markdown", "source": [ "Wrapping the Dense layer in `TimeDistributed` makes it clear that the Dense layer is applied independently at each time step and that the model will output a sequence, not just a single vector.\n", "\n", "All outputs are 
needed during training, but only the output at the 
last time step is useful for predictions and for evaluation. So although we will rely on the MSE over all the outputs for training, we will use a custom metric for\n", "evaluation, to only compute the MSE over the output at the last time step:" ], "metadata": { "id": "835XgNffQAVq" } }, { "cell_type": "code", "source": [ "def last_time_step_mse(Y_true, Y_pred):\n", " return keras.metrics.mean_squared_error(Y_true[:, -1], Y_pred[:, -1])\n", "\n", "model.compile(loss=\"mse\", optimizer=keras.optimizers.Adam(learning_rate=0.01), metrics=[last_time_step_mse])\n", "history = model.fit(X_train, Y_train, epochs=20,\n", " validation_data=(X_valid, Y_valid))" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ZfljBHwIPzVS", "outputId": "1b640b93-6771-4644-d48d-c7e1f76f2ce5" }, "execution_count": 40, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 27s 115ms/step - loss: 0.0508 - last_time_step_mse: 0.0400 - val_loss: 0.0429 - val_last_time_step_mse: 0.0324\n", "Epoch 2/20\n", "219/219 [==============================] - 24s 112ms/step - loss: 0.0395 - last_time_step_mse: 0.0283 - val_loss: 0.0352 - val_last_time_step_mse: 0.0244\n", "Epoch 3/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0327 - last_time_step_mse: 0.0215 - val_loss: 0.0314 - val_last_time_step_mse: 0.0208\n", "Epoch 4/20\n", "219/219 [==============================] - 26s 118ms/step - loss: 0.0295 - last_time_step_mse: 0.0181 - val_loss: 0.0275 - val_last_time_step_mse: 0.0157\n", "Epoch 5/20\n", "219/219 [==============================] - 24s 112ms/step - loss: 0.0272 - last_time_step_mse: 0.0152 - val_loss: 0.0286 - val_last_time_step_mse: 0.0203\n", "Epoch 6/20\n", "219/219 [==============================] - 24s 111ms/step - loss: 0.0249 - last_time_step_mse: 0.0123 - val_loss: 0.0230 - val_last_time_step_mse: 0.0094\n", "Epoch 7/20\n", "219/219 
[==============================] - 24s 112ms/step - loss: 0.0228 - last_time_step_mse: 0.0100 - val_loss: 0.0220 - val_last_time_step_mse: 0.0086\n", "Epoch 8/20\n", "219/219 [==============================] - 24s 111ms/step - loss: 0.0215 - last_time_step_mse: 0.0085 - val_loss: 0.0221 - val_last_time_step_mse: 0.0098\n", "Epoch 9/20\n", "219/219 [==============================] - 24s 111ms/step - loss: 0.0212 - last_time_step_mse: 0.0085 - val_loss: 0.0206 - val_last_time_step_mse: 0.0073\n", "Epoch 10/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0210 - last_time_step_mse: 0.0084 - val_loss: 0.0197 - val_last_time_step_mse: 0.0071\n", "Epoch 11/20\n", "219/219 [==============================] - 24s 110ms/step - loss: 0.0205 - last_time_step_mse: 0.0081 - val_loss: 0.0200 - val_last_time_step_mse: 0.0076\n", "Epoch 12/20\n", "219/219 [==============================] - 24s 111ms/step - loss: 0.0203 - last_time_step_mse: 0.0083 - val_loss: 0.0202 - val_last_time_step_mse: 0.0086\n", "Epoch 13/20\n", "219/219 [==============================] - 25s 112ms/step - loss: 0.0197 - last_time_step_mse: 0.0074 - val_loss: 0.0208 - val_last_time_step_mse: 0.0082\n", "Epoch 14/20\n", "219/219 [==============================] - 25s 114ms/step - loss: 0.0192 - last_time_step_mse: 0.0070 - val_loss: 0.0192 - val_last_time_step_mse: 0.0070\n", "Epoch 15/20\n", "219/219 [==============================] - 25s 114ms/step - loss: 0.0191 - last_time_step_mse: 0.0071 - val_loss: 0.0185 - val_last_time_step_mse: 0.0071\n", "Epoch 16/20\n", "219/219 [==============================] - 24s 112ms/step - loss: 0.0189 - last_time_step_mse: 0.0069 - val_loss: 0.0191 - val_last_time_step_mse: 0.0085\n", "Epoch 17/20\n", "219/219 [==============================] - 24s 110ms/step - loss: 0.0188 - last_time_step_mse: 0.0069 - val_loss: 0.0185 - val_last_time_step_mse: 0.0070\n", "Epoch 18/20\n", "219/219 [==============================] - 24s 110ms/step - loss: 0.0186 
- last_time_step_mse: 0.0069 - val_loss: 0.0181 - val_last_time_step_mse: 0.0066\n", "Epoch 19/20\n", "219/219 [==============================] - 24s 112ms/step - loss: 0.0185 - last_time_step_mse: 0.0069 - val_loss: 0.0176 - val_last_time_step_mse: 0.0062\n", "Epoch 20/20\n", "219/219 [==============================] - 24s 109ms/step - loss: 0.0185 - last_time_step_mse: 0.0070 - val_loss: 0.0192 - val_last_time_step_mse: 0.0077\n" ] } ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "id": "LT0dimq-5jlA" }, "outputs": [], "source": [ "np.random.seed(43)\n", "\n", "series = generate_time_series(1, 50 + 10)\n", "X_new, Y_new = series[:, :50, :], series[:, 50:, :]\n", "Y_pred = model.predict(X_new)[:, -1][..., np.newaxis]" ] }, { "cell_type": "code", "source": [ "model.evaluate(X_valid, Y_valid)" ], "metadata": { "id": "7-GHQBIcZR31" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "execution_count": 43, "metadata": { "id": "DICzAf4-5jlB", "outputId": "7b867ee3-336f-4d78-9ac7-068ace0a8fc2", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_multiple_forecasts(X_new, Y_new, Y_pred)\n", "plt.show()" ] }, { "cell_type": "markdown", "source": [ "You might find https://www.tensorflow.org/api_docs/python/tf/keras/utils/timeseries_dataset_from_array or https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/sequence/TimeseriesGenerator useful" ], "metadata": { "id": "C2xRfWneMqEu" } }, { "cell_type": "markdown", "metadata": { "id": "rAKE21VC5jlB" }, "source": [ "## Deep RNNs with Layer Norm" ] }, { "cell_type": "markdown", "source": [ "Let’s use tf.keras to implement Layer Normalization within a simple memory cell. We need to define a custom memory cell. It is just like a regular layer, except its `call()` method takes two arguments: the inputs at the current time step and the hidden states from the previous time step. Note that the states argument is a list containing one or more tensors. In the case of a simple RNN cell it contains a single tensor equal to the outputs of the previous time step, but other cells may have multiple state tensors (e.g., an LSTMCell has a long-term state and a short-term state). A cell must also have a `state_size` attribute and an `output_size` attribute. In a simple RNN, both are simply equal to the number of units. 
The following code implements a custom memory cell which will behave like a `SimpleRNNCell`, except it will also apply Layer Normalization at each time step:" ], "metadata": { "id": "QnIADsRAU6lL" } }, { "cell_type": "code", "execution_count": 44, "metadata": { "id": "BQ5QfPbp5jlB" }, "outputs": [], "source": [ "from tensorflow.keras.layers import LayerNormalization" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "id": "tMtn5tDX5jlB" }, "outputs": [], "source": [ "class LNSimpleRNNCell(keras.layers.Layer):\n", "    def __init__(self, units, activation=\"tanh\", **kwargs):\n", "        super().__init__(**kwargs)\n", "        self.state_size = units\n", "        self.output_size = units\n", "        self.simple_rnn_cell = keras.layers.SimpleRNNCell(units,\n", "                                                          activation=None)\n", "        self.layer_norm = LayerNormalization()\n", "        self.activation = keras.activations.get(activation)\n", "    def get_initial_state(self, inputs=None, batch_size=None, dtype=None):\n", "        if inputs is not None:\n", "            batch_size = tf.shape(inputs)[0]\n", "            dtype = inputs.dtype\n", "        return [tf.zeros([batch_size, self.state_size], dtype=dtype)]\n", "    def call(self, inputs, states):\n", "        outputs, new_states = self.simple_rnn_cell(inputs, states)\n", "        # in a SimpleRNNCell, the outputs are just equal to the hidden states: new_states[0] is equal to outputs,\n", "        # so we can safely ignore new_states in the rest of the call() method.\n", "        norm_outputs = self.activation(self.layer_norm(outputs))\n", "        return norm_outputs, [norm_outputs]" ] }, { "cell_type": "markdown", "source": [ "Similarly, you could create a custom cell to apply dropout between each time step. But there’s a simpler way: all recurrent layers and all cells provided by Keras have a `dropout` hyperparameter and a `recurrent_dropout` hyperparameter: the **former defines the dropout rate to apply to the inputs** (at each time step), and the latter defines the **dropout rate for the hidden states** (also at each time step)." 
], "metadata": { "id": "K7qg9UpqVdDd" } }, { "cell_type": "code", "execution_count": 47, "metadata": { "id": "jZa2bF8R5jlB", "outputId": "0f7d7af8-1726-4807-9f24-91b9c22e7e2e", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 74s 317ms/step - loss: 0.1668 - last_time_step_mse: 0.1624 - val_loss: 0.0738 - val_last_time_step_mse: 0.0686\n", "Epoch 2/20\n", "219/219 [==============================] - 56s 254ms/step - loss: 0.0639 - last_time_step_mse: 0.0567 - val_loss: 0.0569 - val_last_time_step_mse: 0.0498\n", "Epoch 3/20\n", "219/219 [==============================] - 55s 251ms/step - loss: 0.0538 - last_time_step_mse: 0.0462 - val_loss: 0.0504 - val_last_time_step_mse: 0.0428\n", "Epoch 4/20\n", "219/219 [==============================] - 74s 340ms/step - loss: 0.0472 - last_time_step_mse: 0.0386 - val_loss: 0.0444 - val_last_time_step_mse: 0.0358\n", "Epoch 5/20\n", "219/219 [==============================] - 82s 376ms/step - loss: 0.0416 - last_time_step_mse: 0.0316 - val_loss: 0.0390 - val_last_time_step_mse: 0.0284\n", "Epoch 6/20\n", "219/219 [==============================] - 54s 249ms/step - loss: 0.0384 - last_time_step_mse: 0.0274 - val_loss: 0.0367 - val_last_time_step_mse: 0.0256\n", "Epoch 7/20\n", "219/219 [==============================] - 56s 256ms/step - loss: 0.0364 - last_time_step_mse: 0.0254 - val_loss: 0.0349 - val_last_time_step_mse: 0.0239\n", "Epoch 8/20\n", "219/219 [==============================] - 65s 298ms/step - loss: 0.0346 - last_time_step_mse: 0.0233 - val_loss: 0.0341 - val_last_time_step_mse: 0.0229\n", "Epoch 9/20\n", "219/219 [==============================] - 55s 251ms/step - loss: 0.0335 - last_time_step_mse: 0.0221 - val_loss: 0.0334 - val_last_time_step_mse: 0.0227\n", "Epoch 10/20\n", "219/219 [==============================] - 69s 315ms/step - loss: 0.0323 - last_time_step_mse: 0.0207 - 
val_loss: 0.0313 - val_last_time_step_mse: 0.0192\n", "Epoch 11/20\n", "219/219 [==============================] - 72s 331ms/step - loss: 0.0314 - last_time_step_mse: 0.0194 - val_loss: 0.0300 - val_last_time_step_mse: 0.0180\n", "Epoch 12/20\n", "219/219 [==============================] - 57s 262ms/step - loss: 0.0311 - last_time_step_mse: 0.0192 - val_loss: 0.0301 - val_last_time_step_mse: 0.0184\n", "Epoch 13/20\n", "219/219 [==============================] - 70s 318ms/step - loss: 0.0299 - last_time_step_mse: 0.0178 - val_loss: 0.0289 - val_last_time_step_mse: 0.0168\n", "Epoch 14/20\n", "219/219 [==============================] - 56s 255ms/step - loss: 0.0295 - last_time_step_mse: 0.0173 - val_loss: 0.0288 - val_last_time_step_mse: 0.0167\n", "Epoch 15/20\n", "219/219 [==============================] - 67s 308ms/step - loss: 0.0291 - last_time_step_mse: 0.0170 - val_loss: 0.0289 - val_last_time_step_mse: 0.0167\n", "Epoch 16/20\n", "219/219 [==============================] - 55s 250ms/step - loss: 0.0288 - last_time_step_mse: 0.0168 - val_loss: 0.0280 - val_last_time_step_mse: 0.0160\n", "Epoch 17/20\n", "219/219 [==============================] - 57s 260ms/step - loss: 0.0285 - last_time_step_mse: 0.0164 - val_loss: 0.0284 - val_last_time_step_mse: 0.0164\n", "Epoch 18/20\n", "219/219 [==============================] - 56s 257ms/step - loss: 0.0280 - last_time_step_mse: 0.0158 - val_loss: 0.0275 - val_last_time_step_mse: 0.0154\n", "Epoch 19/20\n", "219/219 [==============================] - 54s 248ms/step - loss: 0.0280 - last_time_step_mse: 0.0159 - val_loss: 0.0278 - val_last_time_step_mse: 0.0150\n", "Epoch 20/20\n", "219/219 [==============================] - 55s 253ms/step - loss: 0.0276 - last_time_step_mse: 0.0154 - val_loss: 0.0276 - val_last_time_step_mse: 0.0156\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.RNN(LNSimpleRNNCell(20), return_sequences=True,\n", " 
input_shape=[None, 1]),\n", " keras.layers.RNN(LNSimpleRNNCell(20), return_sequences=True),\n", " keras.layers.TimeDistributed(keras.layers.Dense(10))\n", "])\n", "\n", "model.compile(loss=\"mse\", optimizer=\"adam\", metrics=[last_time_step_mse])\n", "history = model.fit(X_train, Y_train, epochs=20,\n", " validation_data=(X_valid, Y_valid))" ] }, { "cell_type": "code", "source": [ "model.evaluate(X_valid, Y_valid)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "4EfwMTdzZOMf", "outputId": "86db371a-a6da-4814-b7ee-d4e7a30671ec" }, "execution_count": 49, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 2s 24ms/step - loss: 0.0276 - last_time_step_mse: 0.0156\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "[0.027597270905971527, 0.015594885684549809]" ] }, "metadata": {}, "execution_count": 49 } ] }, { "cell_type": "markdown", "metadata": { "id": "3QR4Q74s5jlC" }, "source": [ "## LSTMs" ] }, { "cell_type": "markdown", "source": [ "In Keras, you can simply use the LSTM layer instead of the SimpleRNN layer:" ], "metadata": { "id": "6VSPGUazWzI1" } }, { "cell_type": "code", "execution_count": 50, "metadata": { "scrolled": true, "id": "hFClcQIw5jlC", "outputId": "9286bdac-5599-4c79-e1c7-7db3b147024e", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_7\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " lstm (LSTM) (None, None, 20) 1760 \n", " \n", " lstm_1 (LSTM) (None, None, 20) 3280 \n", " \n", " time_distributed_2 (TimeDis (None, None, 10) 210 \n", " tributed) \n", " \n", "=================================================================\n", "Total params: 5,250\n", "Trainable params: 5,250\n", "Non-trainable params: 0\n", 
"_________________________________________________________________\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", "    keras.layers.LSTM(20, return_sequences=True, input_shape=[None, 1]),\n", "    keras.layers.LSTM(20, return_sequences=True),\n", "    keras.layers.TimeDistributed(keras.layers.Dense(10))\n", "])\n", "\n", "model.summary()" ] }, { "cell_type": "code", "source": [ "# keras.layers.RNN(keras.layers.LSTMCell(20), return_sequences=True, input_shape=[None, 1]) also works.\n", "# However, the LSTM layer uses an optimized implementation when running on a GPU.\n", "# The RNN layer is mostly useful when you define custom cells, as we did earlier." ], "metadata": { "id": "-YL5WRMMXxvr" }, "execution_count": 48, "outputs": [] }, { "cell_type": "code", "source": [ "model.compile(loss=\"mse\", optimizer=\"adam\", metrics=[last_time_step_mse])\n", "history = model.fit(X_train, Y_train, epochs=20,\n", "                    validation_data=(X_valid, Y_valid))" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "LdL-BteUXio0", "outputId": "e9748267-a240-43d3-c18b-a0c67924364d" }, "execution_count": 51, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 10s 27ms/step - loss: 0.0760 - last_time_step_mse: 0.0615 - val_loss: 0.0554 - val_last_time_step_mse: 0.0364\n", "Epoch 2/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0480 - last_time_step_mse: 0.0283 - val_loss: 0.0427 - val_last_time_step_mse: 0.0222\n", "Epoch 3/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0391 - last_time_step_mse: 0.0181 - val_loss: 0.0367 - val_last_time_step_mse: 0.0157\n", "Epoch 4/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0350 - last_time_step_mse: 0.0151 - val_loss: 0.0334 - val_last_time_step_mse: 0.0132\n", "Epoch 5/20\n", "219/219 [==============================] - 5s 
23ms/step - loss: 0.0325 - last_time_step_mse: 0.0133 - val_loss: 0.0314 - val_last_time_step_mse: 0.0121\n", "Epoch 6/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0308 - last_time_step_mse: 0.0122 - val_loss: 0.0298 - val_last_time_step_mse: 0.0112\n", "Epoch 7/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0297 - last_time_step_mse: 0.0118 - val_loss: 0.0291 - val_last_time_step_mse: 0.0120\n", "Epoch 8/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0286 - last_time_step_mse: 0.0109 - val_loss: 0.0278 - val_last_time_step_mse: 0.0099\n", "Epoch 9/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0280 - last_time_step_mse: 0.0108 - val_loss: 0.0278 - val_last_time_step_mse: 0.0113\n", "Epoch 10/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0273 - last_time_step_mse: 0.0105 - val_loss: 0.0268 - val_last_time_step_mse: 0.0101\n", "Epoch 11/20\n", "219/219 [==============================] - 5s 24ms/step - loss: 0.0269 - last_time_step_mse: 0.0102 - val_loss: 0.0263 - val_last_time_step_mse: 0.0096\n", "Epoch 12/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0264 - last_time_step_mse: 0.0101 - val_loss: 0.0263 - val_last_time_step_mse: 0.0105\n", "Epoch 13/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0259 - last_time_step_mse: 0.0097 - val_loss: 0.0257 - val_last_time_step_mse: 0.0100\n", "Epoch 14/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0257 - last_time_step_mse: 0.0096 - val_loss: 0.0252 - val_last_time_step_mse: 0.0091\n", "Epoch 15/20\n", "219/219 [==============================] - 5s 24ms/step - loss: 0.0253 - last_time_step_mse: 0.0095 - val_loss: 0.0251 - val_last_time_step_mse: 0.0092\n", "Epoch 16/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0251 - last_time_step_mse: 0.0095 - val_loss: 0.0248 - 
val_last_time_step_mse: 0.0089\n", "Epoch 17/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0248 - last_time_step_mse: 0.0094 - val_loss: 0.0248 - val_last_time_step_mse: 0.0098\n", "Epoch 18/20\n", "219/219 [==============================] - 5s 25ms/step - loss: 0.0245 - last_time_step_mse: 0.0093 - val_loss: 0.0246 - val_last_time_step_mse: 0.0091\n", "Epoch 19/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0242 - last_time_step_mse: 0.0091 - val_loss: 0.0238 - val_last_time_step_mse: 0.0085\n", "Epoch 20/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0239 - last_time_step_mse: 0.0089 - val_loss: 0.0238 - val_last_time_step_mse: 0.0086\n" ] } ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "id": "H-BCnT5d5jlC", "outputId": "b2a3e1da-15b6-4093-dec0-5d20c4f99f66", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 1s 10ms/step - loss: 0.0238 - last_time_step_mse: 0.0086\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "[0.023788688704371452, 0.008560807444155216]" ] }, "metadata": {}, "execution_count": 52 } ], "source": [ "model.evaluate(X_valid, Y_valid)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "id": "dHvxuwIw5jlC", "outputId": "8e240048-c556-4fed-84a9-e7a1eff224af", "colab": { "base_uri": "https://localhost:8080/", "height": 291 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_learning_curves(history.history[\"loss\"], history.history[\"val_loss\"])\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "id": "yJIAA4wP5jlC" }, "outputs": [], "source": [ "np.random.seed(43)\n", "\n", "series = generate_time_series(1, 50 + 10)\n", "X_new, Y_new = series[:, :50, :], series[:, 50:, :]\n", "Y_pred = model.predict(X_new)[:, -1][..., np.newaxis]" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "scrolled": true, "id": "9Q83vZbA5jlC", "outputId": "fca5256c-633c-415a-8d97-38d8a15b4f6d", "colab": { "base_uri": "https://localhost:8080/", "height": 293 } }, "outputs": [ { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" } } ], "source": [ "plot_multiple_forecasts(X_new, Y_new, Y_pred)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "V5kAzvP15jlD" }, "source": [ "## GRUs" ] }, { "cell_type": "code", "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.GRU(20, return_sequences=True, input_shape=[None, 1]),\n", " keras.layers.GRU(20, return_sequences=True),\n", " keras.layers.TimeDistributed(keras.layers.Dense(10))\n", "])\n", "\n", "model.summary()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "U0pp1tbVcFEP", "outputId": "3dedff16-df49-459d-aa13-4f607d010e1a" }, "execution_count": 57, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_8\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " gru (GRU) (None, None, 20) 1380 \n", " \n", " gru_1 (GRU) (None, None, 20) 2520 \n", " \n", " time_distributed_3 (TimeDis (None, None, 10) 210 \n", " tributed) \n", " \n", "=================================================================\n", "Total params: 4,110\n", "Trainable params: 4,110\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "id": "cV-aoeD95jlD", "outputId": "aa2fe19b-4891-4f77-e507-0751f2053dde", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 11s 31ms/step - loss: 0.0738 - last_time_step_mse: 0.0655 - val_loss: 0.0538 - val_last_time_step_mse: 0.0450\n", "Epoch 2/20\n", "219/219 [==============================] - 6s 29ms/step - loss: 0.0476 - last_time_step_mse: 0.0367 - val_loss: 0.0441 - 
val_last_time_step_mse: 0.0326\n", "Epoch 3/20\n", "219/219 [==============================] - 5s 25ms/step - loss: 0.0417 - last_time_step_mse: 0.0301 - val_loss: 0.0390 - val_last_time_step_mse: 0.0275\n", "Epoch 4/20\n", "219/219 [==============================] - 6s 27ms/step - loss: 0.0368 - last_time_step_mse: 0.0243 - val_loss: 0.0339 - val_last_time_step_mse: 0.0202\n", "Epoch 5/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0326 - last_time_step_mse: 0.0180 - val_loss: 0.0312 - val_last_time_step_mse: 0.0164\n", "Epoch 6/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0306 - last_time_step_mse: 0.0155 - val_loss: 0.0294 - val_last_time_step_mse: 0.0143\n", "Epoch 7/20\n", "219/219 [==============================] - 5s 24ms/step - loss: 0.0295 - last_time_step_mse: 0.0145 - val_loss: 0.0300 - val_last_time_step_mse: 0.0162\n", "Epoch 8/20\n", "219/219 [==============================] - 5s 24ms/step - loss: 0.0283 - last_time_step_mse: 0.0135 - val_loss: 0.0278 - val_last_time_step_mse: 0.0130\n", "Epoch 9/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0276 - last_time_step_mse: 0.0130 - val_loss: 0.0273 - val_last_time_step_mse: 0.0127\n", "Epoch 10/20\n", "219/219 [==============================] - 5s 24ms/step - loss: 0.0269 - last_time_step_mse: 0.0125 - val_loss: 0.0264 - val_last_time_step_mse: 0.0121\n", "Epoch 11/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0265 - last_time_step_mse: 0.0121 - val_loss: 0.0268 - val_last_time_step_mse: 0.0135\n", "Epoch 12/20\n", "219/219 [==============================] - 5s 25ms/step - loss: 0.0263 - last_time_step_mse: 0.0123 - val_loss: 0.0261 - val_last_time_step_mse: 0.0123\n", "Epoch 13/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0258 - last_time_step_mse: 0.0116 - val_loss: 0.0254 - val_last_time_step_mse: 0.0116\n", "Epoch 14/20\n", "219/219 
[==============================] - 5s 23ms/step - loss: 0.0256 - last_time_step_mse: 0.0117 - val_loss: 0.0254 - val_last_time_step_mse: 0.0116\n", "Epoch 15/20\n", "219/219 [==============================] - 5s 24ms/step - loss: 0.0253 - last_time_step_mse: 0.0114 - val_loss: 0.0250 - val_last_time_step_mse: 0.0112\n", "Epoch 16/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0251 - last_time_step_mse: 0.0114 - val_loss: 0.0250 - val_last_time_step_mse: 0.0114\n", "Epoch 17/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0248 - last_time_step_mse: 0.0112 - val_loss: 0.0249 - val_last_time_step_mse: 0.0118\n", "Epoch 18/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0245 - last_time_step_mse: 0.0110 - val_loss: 0.0244 - val_last_time_step_mse: 0.0108\n", "Epoch 19/20\n", "219/219 [==============================] - 5s 23ms/step - loss: 0.0243 - last_time_step_mse: 0.0108 - val_loss: 0.0240 - val_last_time_step_mse: 0.0105\n", "Epoch 20/20\n", "219/219 [==============================] - 5s 25ms/step - loss: 0.0240 - last_time_step_mse: 0.0106 - val_loss: 0.0238 - val_last_time_step_mse: 0.0103\n" ] } ], "source": [ "model.compile(loss=\"mse\", optimizer=\"adam\", metrics=[last_time_step_mse])\n", "history = model.fit(X_train, Y_train, epochs=20,\n", " validation_data=(X_valid, Y_valid))" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "id": "TNOeSZW_5jlD", "outputId": "7e827299-42da-4e52-e341-3ed81d0333e9", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 1s 9ms/step - loss: 0.0238 - last_time_step_mse: 0.0103\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "[0.02378549985587597, 0.010262805968523026]" ] }, "metadata": {}, "execution_count": 58 } ], "source": [ "model.evaluate(X_valid, Y_valid)" ] }, { "cell_type": "markdown", "metadata": 
{ "id": "9gDnHqvg5jlD" }, "source": [ "## Using One-Dimensional Convolutional Layers to Process Sequences" ] }, { "cell_type": "markdown", "source": [ "The following model is the same as earlier, except it starts with a 1D convolutional layer that downsamples the input sequence by a factor of 2, using\n", "a stride of 2. By shortening the sequences, the convolutional layer may help the GRU layers detect longer patterns. Note that we must also crop off the first three time steps in the targets (since the kernel’s size is 4, the first output of the convolutional layer will be based on the input time steps 0 to 3), and downsample the targets by a factor of 2:" ], "metadata": { "id": "yTdnZoTLYjGN" } }, { "cell_type": "code", "execution_count": 59, "metadata": { "id": "9PKOidly5jlE", "outputId": "d77c8947-1acc-497e-b7b0-df3dcbfb0fe9", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"sequential_9\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " conv1d (Conv1D) (None, None, 20) 100 \n", " \n", " gru_2 (GRU) (None, None, 20) 2520 \n", " \n", " gru_3 (GRU) (None, None, 20) 2520 \n", " \n", " time_distributed_4 (TimeDis (None, None, 10) 210 \n", " tributed) \n", " \n", "=================================================================\n", "Total params: 5,350\n", "Trainable params: 5,350\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.Conv1D(filters=20, kernel_size=4, strides=2, padding=\"valid\",\n", " input_shape=[None, 1]),\n", " keras.layers.GRU(20, return_sequences=True),\n", " keras.layers.GRU(20, return_sequences=True),\n", " 
keras.layers.TimeDistributed(keras.layers.Dense(10))\n", "])\n", "\n", "model.summary()" ] }, { "cell_type": "code", "source": [ "model.compile(loss=\"mse\", optimizer=\"adam\", metrics=[last_time_step_mse])\n", "history = model.fit(X_train, Y_train[:, 3::2], epochs=20,\n", " validation_data=(X_valid, Y_valid[:, 3::2]))" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "kMugaQpAcKFB", "outputId": "2b4564bd-988a-4735-b9a2-c6e029300045" }, "execution_count": 60, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Epoch 1/20\n", "219/219 [==============================] - 13s 19ms/step - loss: 0.0681 - last_time_step_mse: 0.0601 - val_loss: 0.0477 - val_last_time_step_mse: 0.0396\n", "Epoch 2/20\n", "219/219 [==============================] - 4s 16ms/step - loss: 0.0414 - last_time_step_mse: 0.0340 - val_loss: 0.0367 - val_last_time_step_mse: 0.0285\n", "Epoch 3/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0338 - last_time_step_mse: 0.0257 - val_loss: 0.0307 - val_last_time_step_mse: 0.0218\n", "Epoch 4/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0282 - last_time_step_mse: 0.0184 - val_loss: 0.0259 - val_last_time_step_mse: 0.0152\n", "Epoch 5/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0249 - last_time_step_mse: 0.0143 - val_loss: 0.0246 - val_last_time_step_mse: 0.0141\n", "Epoch 6/20\n", "219/219 [==============================] - 4s 16ms/step - loss: 0.0234 - last_time_step_mse: 0.0125 - val_loss: 0.0227 - val_last_time_step_mse: 0.0115\n", "Epoch 7/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0226 - last_time_step_mse: 0.0117 - val_loss: 0.0225 - val_last_time_step_mse: 0.0116\n", "Epoch 8/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0220 - last_time_step_mse: 0.0111 - val_loss: 0.0216 - val_last_time_step_mse: 0.0105\n", "Epoch 9/20\n", "219/219 
[==============================] - 3s 15ms/step - loss: 0.0216 - last_time_step_mse: 0.0108 - val_loss: 0.0217 - val_last_time_step_mse: 0.0109\n", "Epoch 10/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0213 - last_time_step_mse: 0.0106 - val_loss: 0.0210 - val_last_time_step_mse: 0.0102\n", "Epoch 11/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0210 - last_time_step_mse: 0.0102 - val_loss: 0.0208 - val_last_time_step_mse: 0.0100\n", "Epoch 12/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0208 - last_time_step_mse: 0.0102 - val_loss: 0.0208 - val_last_time_step_mse: 0.0102\n", "Epoch 13/20\n", "219/219 [==============================] - 4s 16ms/step - loss: 0.0205 - last_time_step_mse: 0.0098 - val_loss: 0.0206 - val_last_time_step_mse: 0.0101\n", "Epoch 14/20\n", "219/219 [==============================] - 4s 16ms/step - loss: 0.0204 - last_time_step_mse: 0.0099 - val_loss: 0.0204 - val_last_time_step_mse: 0.0099\n", "Epoch 15/20\n", "219/219 [==============================] - 4s 16ms/step - loss: 0.0202 - last_time_step_mse: 0.0097 - val_loss: 0.0199 - val_last_time_step_mse: 0.0093\n", "Epoch 16/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0200 - last_time_step_mse: 0.0097 - val_loss: 0.0201 - val_last_time_step_mse: 0.0095\n", "Epoch 17/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0196 - last_time_step_mse: 0.0093 - val_loss: 0.0197 - val_last_time_step_mse: 0.0091\n", "Epoch 18/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0194 - last_time_step_mse: 0.0090 - val_loss: 0.0192 - val_last_time_step_mse: 0.0086\n", "Epoch 19/20\n", "219/219 [==============================] - 4s 16ms/step - loss: 0.0190 - last_time_step_mse: 0.0087 - val_loss: 0.0188 - val_last_time_step_mse: 0.0084\n", "Epoch 20/20\n", "219/219 [==============================] - 3s 15ms/step - loss: 0.0186 - last_time_step_mse: 
0.0083 - val_loss: 0.0184 - val_last_time_step_mse: 0.0080\n" ] } ] }, { "cell_type": "code", "source": [ "model.evaluate(X_valid, Y_valid[:, 3::2])" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "DlBLXViLcOxn", "outputId": "cf835442-e81f-4be9-8b09-22942cca6d01" }, "execution_count": 62, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "63/63 [==============================] - 1s 8ms/step - loss: 0.0184 - last_time_step_mse: 0.0080\n" ] }, { "output_type": "execute_result", "data": { "text/plain": [ "[0.018366245552897453, 0.00796824786812067]" ] }, "metadata": {}, "execution_count": 62 } ] }, { "cell_type": "markdown", "source": [ "# Natural-language processing" ], "metadata": { "id": "e16QVbDxcPb2" } }, { "cell_type": "markdown", "source": [ "## Preparing text data" ], "metadata": { "id": "Vq_YGZBzcZsn" } }, { "cell_type": "markdown", "source": [ "In plain Python, the text-vectorization process (standardize, tokenize, index) can be implemented as follows:" ], "metadata": { "id": "ACRmbWfKdbvv" } }, { "cell_type": "code", "source": [ "import string\n", "\n", "class Vectorizer:\n", "    def standardize(self, text):\n", "        text = text.lower()\n", "        return \"\".join(char for char in text if char not in string.punctuation)\n", "\n", "    def tokenize(self, text):\n", "        text = self.standardize(text)\n", "        return text.split()\n", "\n", "    def make_vocabulary(self, dataset):\n", "        self.vocabulary = {\"\": 0, \"[UNK]\": 1}\n", "        for text in dataset:\n", "            text = self.standardize(text)\n", "            tokens = self.tokenize(text)\n", "            for token in tokens:\n", "                if token not in self.vocabulary:\n", "                    self.vocabulary[token] = len(self.vocabulary)\n", "        self.inverse_vocabulary = dict(\n", "            (v, k) for k, v in self.vocabulary.items())\n", "\n", "    def encode(self, text):\n", "        text = self.standardize(text)\n", "        tokens = self.tokenize(text)\n", "        return [self.vocabulary.get(token, 1) for token in tokens]\n", "\n", "    def decode(self, int_sequence):\n", "        return \" \".join(\n", " 
self.inverse_vocabulary.get(i, \"[UNK]\") for i in int_sequence)\n", "\n", "vectorizer = Vectorizer()\n", "dataset = [\n", " \"I write, erase, rewrite\",\n", " \"Erase again, and then\",\n", " \"A poppy blooms.\",\n", "]\n", "vectorizer.make_vocabulary(dataset)" ], "metadata": { "id": "JPAOQHVAdZTR" }, "execution_count": 63, "outputs": [] }, { "cell_type": "code", "source": [ "test_sentence = \"I write, rewrite, and still rewrite again\"\n", "encoded_sentence = vectorizer.encode(test_sentence)\n", "print(encoded_sentence)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "VGgm3yP3drce", "outputId": "6b368c1a-3ec3-48e9-e7f1-ad80eae4dc9c" }, "execution_count": 64, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "[2, 3, 5, 7, 1, 5, 6]\n" ] } ] }, { "cell_type": "code", "source": [ "decoded_sentence = vectorizer.decode(encoded_sentence)\n", "print(decoded_sentence)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "yeVLfOrAdxRf", "outputId": "7f564f9f-8911-46dd-c9bd-04f4ffde58a4" }, "execution_count": 65, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "i write rewrite and [UNK] rewrite again\n" ] } ] }, { "cell_type": "markdown", "source": [ "However, using something like this wouldn’t be very performant. In practice, you’ll work with the Keras `TextVectorization` layer, which is fast and efficient and can be dropped directly into a `tf.data` pipeline or a Keras model." 
], "metadata": { "id": "cF6Xq7YMcip3" } }, { "cell_type": "code", "source": [ "from tensorflow.keras.layers import TextVectorization\n", "# Configures the layer to return sequences of words encoded\n", "# as integer indices.\n", "text_vectorization = TextVectorization(\n", "    output_mode=\"int\",\n", ")" ], "metadata": { "id": "7oWACab-cTGJ" }, "execution_count": 66, "outputs": [] }, { "cell_type": "markdown", "source": [ "By default, the `TextVectorization` layer will use the setting “convert to lowercase and remove punctuation” for text standardization, and “split on whitespace” for tokenization.\n", "\n", "But importantly, you can provide custom functions for standardization and tokenization, which means the layer is flexible enough to handle any use case.\n", "\n", "To index the vocabulary of a text corpus, just call the `adapt()` method of the layer with a `Dataset` object that yields strings, or just with a list of Python strings:" ], "metadata": { "id": "t8PWeaj0eUSp" } }, { "cell_type": "code", "source": [ "dataset = [\n", "    \"I write, erase, rewrite\",\n", "    \"Erase again, and then\",\n", "    \"A poppy blooms.\",\n", "]\n", "text_vectorization.adapt(dataset)" ], "metadata": { "id": "9IqhI0sceL0Q" }, "execution_count": 67, "outputs": [] }, { "cell_type": "markdown", "source": [ "Note that you can retrieve the computed vocabulary via `get_vocabulary()`. This can be useful if you need to convert text encoded as integer sequences back into words. The first two entries in the vocabulary are the mask token (index 0) and the out-of-vocabulary (OOV) token (index 1). Entries in the vocabulary list are sorted by frequency, so with a real-world dataset, very common words like “the” or “a” would come first." 
], "metadata": { "id": "tMljNvynex1J" } }, { "cell_type": "code", "source": [ "text_vectorization.get_vocabulary()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "y3XeB2HYegB6", "outputId": "3398c130-6067-4b33-9979-679a4d2eb66d" }, "execution_count": 68, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "['',\n", " '[UNK]',\n", " 'erase',\n", " 'write',\n", " 'then',\n", " 'rewrite',\n", " 'poppy',\n", " 'i',\n", " 'blooms',\n", " 'and',\n", " 'again',\n", " 'a']" ] }, "metadata": {}, "execution_count": 68 } ] }, { "cell_type": "code", "source": [ "vocabulary = text_vectorization.get_vocabulary()\n", "test_sentence = \"I write, rewrite, and still rewrite again\"\n", "encoded_sentence = text_vectorization(test_sentence)\n", "print(encoded_sentence)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "3OldsH4Qe4Qb", "outputId": "da56367e-1b43-4285-ebf7-36d48b1451a6" }, "execution_count": 69, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "tf.Tensor([ 7 3 5 9 1 5 10], shape=(7,), dtype=int64)\n" ] } ] }, { "cell_type": "code", "source": [ "inverse_vocab = dict(enumerate(vocabulary))\n", "decoded_sentence = \" \".join(inverse_vocab[int(i)] for i in encoded_sentence)\n", "print(decoded_sentence)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "v94pHFXUe-MO", "outputId": "b3a5e18d-79df-4594-8d6f-79f0933a4fa4" }, "execution_count": 70, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "i write rewrite and [UNK] rewrite again\n" ] } ] }, { "cell_type": "markdown", "source": [ "Before diving into the modeling part, let’s prepare a dataset. We’ll demonstrate each approach on a well-known text-classification benchmark: the IMDB movie review sentiment-classification dataset. Let’s process the raw\n", "IMDB text data, just as you would when approaching a new text-classification problem in the real world. 
\n", "\n", "https://ai.stanford.edu/~amaas/data/sentiment/\n", "\n", "Let’s start by downloading the dataset from the Stanford page of Andrew Maas and uncompressing it" ], "metadata": { "id": "DVxg9nzSfPuY" } }, { "cell_type": "code", "source": [ "!curl -O https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz\n", "!tar -xf aclImdb_v1.tar.gz" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "WWTJRDZXfOtX", "outputId": "b342fa45-bee0-423a-936a-d0176d393283" }, "execution_count": 71, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "100 80.2M 100 80.2M 0 0 10.1M 0 0:00:07 0:00:07 --:--:-- 16.5M\n" ] } ] }, { "cell_type": "markdown", "source": [ "There’s also a `train/unsup` subdirectory in there, which we don’t need. Let’s\n", "delete it:" ], "metadata": { "id": "9Firy_UuhRf4" } }, { "cell_type": "code", "source": [ "!rm -r aclImdb/train/unsup" ], "metadata": { "id": "y5jDnHImg4lT" }, "execution_count": 72, "outputs": [] }, { "cell_type": "markdown", "source": [ "Take a look at the content of a few of these text files. Whether you’re working with text data or image data, remember to always inspect what your data looks like before you dive into modeling it." 
], "metadata": { "id": "jdRkfu9uhXbx" } }, { "cell_type": "code", "source": [ "!cat aclImdb/train/pos/4077_10.txt" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "DzJMWYO0hZOO", "outputId": "250da9ff-4194-402a-a1a4-d6d791c92d1e" }, "execution_count": 73, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "I first saw this back in the early 90s on UK TV, i did like it then but i missed the chance to tape it, many years passed but the film always stuck with me and i lost hope of seeing it TV again, the main thing that stuck with me was the end, the hole castle part really touched me, its easy to watch, has a great story, great music, the list goes on and on, its OK me saying how good it is but everyone will take there own best bits away with them once they have seen it, yes the animation is top notch and beautiful to watch, it does show its age in a very few parts but that has now become part of it beauty, i am so glad it has came out on DVD as it is one of my top 10 films of all time. Buy it or rent it just see it, best viewing is at night alone with drink and food in reach so you don't have to stop the film.

Enjoy" ] } ] }, { "cell_type": "markdown", "source": [ "Next, let’s prepare a validation set by setting apart 20% of the training text files in a new directory, `aclImdb/val`:" ], "metadata": { "id": "oEplxGV0hcyS" } }, { "cell_type": "code", "source": [ "import os, pathlib, shutil, random\n", "\n", "base_dir = pathlib.Path(\"aclImdb\")\n", "val_dir = base_dir / \"val\"\n", "train_dir = base_dir / \"train\"\n", "for category in (\"neg\", \"pos\"):\n", "    os.makedirs(val_dir / category)\n", "    files = os.listdir(train_dir / category)\n", "    # Shuffle with a fixed seed before splitting.\n", "    random.Random(1337).shuffle(files)\n", "    # Move the last 20% of the shuffled files to the validation directory.\n", "    num_val_samples = int(0.2 * len(files))\n", "    val_files = files[-num_val_samples:]\n", "    for fname in val_files:\n", "        shutil.move(train_dir / category / fname,\n", "                    val_dir / category / fname)" ], "metadata": { "id": "uI1RHMXfheLP" }, "execution_count": 74, "outputs": [] }, { "cell_type": "markdown", "source": [ "Let’s create three `Dataset` objects for training, validation, and testing, just like in the previous lab:" ], "metadata": { "id": "9rwl4rFphkdx" } }, { "cell_type": "code", "source": [ "from tensorflow import keras\n", "batch_size = 32\n", "\n", "train_ds = keras.utils.text_dataset_from_directory(\n", "    \"aclImdb/train\", batch_size=batch_size\n", ")\n", "val_ds = keras.utils.text_dataset_from_directory(\n", "    \"aclImdb/val\", batch_size=batch_size\n", ")\n", "test_ds = keras.utils.text_dataset_from_directory(\n", "    \"aclImdb/test\", batch_size=batch_size\n", ")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "672YOyLXhnyI", "outputId": "63913c02-61eb-44ab-bff5-7e830afa81e2" }, "execution_count": 75, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Found 20000 files belonging to 2 classes.\n", "Found 5000 files belonging to 2 classes.\n", "Found 25000 files belonging to 2 classes.\n" ] } ] }, { "cell_type": "markdown", "source": [ "These datasets yield inputs that are TensorFlow `tf.string` tensors and targets that are `int32` tensors encoding
the value “0” or “1.”" ], "metadata": { "id": "tILAl18fh9FA" } }, { "cell_type": "code", "source": [ "for inputs, targets in train_ds:\n", " print(\"inputs.shape:\", inputs.shape)\n", " print(\"inputs.dtype:\", inputs.dtype)\n", " print(\"targets.shape:\", targets.shape)\n", " print(\"targets.dtype:\", targets.dtype)\n", " print(\"inputs[0]:\", inputs[0])\n", " print(\"targets[0]:\", targets[0])\n", " break" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "gD_XsKfih3XR", "outputId": "9a9ceccd-17c1-4f8b-d128-6449ef5f28c0" }, "execution_count": 76, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "inputs.shape: (32,)\n", "inputs.dtype: \n", "targets.shape: (32,)\n", "targets.dtype: \n", "inputs[0]: tf.Tensor(b'There\\'s nothing really to dislike about \"The Odd Couple,\" and it\\'s no surprise that Jack Lemmon and Walter Matthau make a hugely winning comedic team. But there\\'s something so underdeveloped about Neil Simon\\'s adaptation of his hit stage play as to make it seem more like a skit on a sketch comedy show than a full-bodied film. I have not seen the play, but have to assume that the screen version is fairly faithful, since Simon wrote it, so the defects cannot be blamed on a stage-to-screen adaptation. There are some interesting ideas in this story--two recently divorced men who fall immediately into traditional married roles when they become roommates because neither knows any differently--that Simon never fully fleshes out. 
Still, there are many worse ways to kill a couple of hours.', shape=(), dtype=string)\n", "targets[0]: tf.Tensor(1, shape=(), dtype=int32)\n" ] } ] }, { "cell_type": "markdown", "source": [ "## Processing words as a set: The bag-of-words approach" ], "metadata": { "id": "nXU16HSgiGAK" } }, { "cell_type": "markdown", "source": [ "### Single words (unigrams) with binary encoding" ], "metadata": { "id": "6akoHLcIkVXr" } }, { "cell_type": "markdown", "source": [ "First, let’s process our raw text datasets with a `TextVectorization` layer so that they yield multi-hot encoded binary word vectors. Our layer will only look at single words (that is to say, unigrams). We will limit the vocabulary to the 20,000 most frequent words. Otherwise we’d be indexing every word in the training data, potentially tens of thousands of terms that occur only once or twice and thus aren’t informative. In general, 20,000 is a reasonable vocabulary size for text classification." ], "metadata": { "id": "E0-NGwoeiNvx" } }, { "cell_type": "code", "source": [ "# Encode the output tokens as multi-hot binary vectors.\n", "text_vectorization = TextVectorization(\n", " max_tokens=20000,\n", " output_mode=\"multi_hot\",\n", ")\n", "# Prepare a dataset that only yields raw text inputs (no labels).\n", "text_only_train_ds = train_ds.map(lambda x, y: x)\n", "text_vectorization.adapt(text_only_train_ds)\n", "\n", "binary_1gram_train_ds = train_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "binary_1gram_val_ds = val_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "binary_1gram_test_ds = test_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)" ], "metadata": { "id": "SV_WnlbKh2xK" }, "execution_count": 77, "outputs": [] }, { "cell_type": "code", "source": [ "for inputs, targets in binary_1gram_train_ds:\n", " print(\"inputs.shape:\", inputs.shape)\n", " print(\"inputs.dtype:\", inputs.dtype)\n", 
" print(\"targets.shape:\", targets.shape)\n", " print(\"targets.dtype:\", targets.dtype)\n", " print(\"inputs[0]:\", inputs[0])\n", " print(\"targets[0]:\", targets[0])\n", " break" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "rgJwHXctiuIH", "outputId": "b167570c-c6d3-4b6e-c0d3-d4ac00780454" }, "execution_count": 78, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "inputs.shape: (32, 20000)\n", "inputs.dtype: \n", "targets.shape: (32,)\n", "targets.dtype: \n", "inputs[0]: tf.Tensor([1. 1. 1. ... 0. 0. 0.], shape=(20000,), dtype=float32)\n", "targets[0]: tf.Tensor(0, shape=(), dtype=int32)\n" ] } ] }, { "cell_type": "code", "source": [ "from tensorflow import keras\n", "from tensorflow.keras import layers\n", "\n", "# A densely connected NN\n", "def get_model(max_tokens=20000, hidden_dim=16):\n", " inputs = keras.Input(shape=(max_tokens,))\n", " x = layers.Dense(hidden_dim, activation=\"relu\")(inputs)\n", " x = layers.Dropout(0.5)(x)\n", " outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", " model = keras.Model(inputs, outputs)\n", " model.compile(optimizer=\"nadam\",\n", " loss=\"binary_crossentropy\",\n", " metrics=[\"accuracy\"])\n", " return model" ], "metadata": { "id": "DqwEhyqfiyma" }, "execution_count": 79, "outputs": [] }, { "cell_type": "markdown", "source": [ "Finally, let’s train and test our model." 
], "metadata": { "id": "yMTMyNB5j0xq" } }, { "cell_type": "code", "source": [ "model = get_model()\n", "model.summary()\n", "callbacks = [\n", " keras.callbacks.ModelCheckpoint(\"binary_1gram.keras\",\n", " save_best_only=True)\n", "]\n", "model.fit(binary_1gram_train_ds.cache(),\n", " validation_data=binary_1gram_val_ds.cache(),\n", " epochs=10,\n", " callbacks=callbacks)\n", "model = keras.models.load_model(\"binary_1gram.keras\")\n", "print(f\"Test acc: {model.evaluate(binary_1gram_test_ds)[1]:.3f}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "h1ws10OIjt_X", "outputId": "eba30619-0f49-4880-d7e9-219acae9bc26" }, "execution_count": 80, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"model\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " input_1 (InputLayer) [(None, 20000)] 0 \n", " \n", " dense_8 (Dense) (None, 16) 320016 \n", " \n", " dropout (Dropout) (None, 16) 0 \n", " \n", " dense_9 (Dense) (None, 1) 17 \n", " \n", "=================================================================\n", "Total params: 320,033\n", "Trainable params: 320,033\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "Epoch 1/10\n", "625/625 [==============================] - 13s 16ms/step - loss: 0.3879 - accuracy: 0.8375 - val_loss: 0.2680 - val_accuracy: 0.8932\n", "Epoch 2/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.2204 - accuracy: 0.9155 - val_loss: 0.2628 - val_accuracy: 0.8892\n", "Epoch 3/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.1569 - accuracy: 0.9441 - val_loss: 0.2857 - val_accuracy: 0.8852\n", "Epoch 4/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.1172 - accuracy: 0.9590 - val_loss: 0.3216 - val_accuracy: 0.8860\n", "Epoch 5/10\n", 
"625/625 [==============================] - 4s 6ms/step - loss: 0.0866 - accuracy: 0.9699 - val_loss: 0.3537 - val_accuracy: 0.8838\n", "Epoch 6/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0733 - accuracy: 0.9723 - val_loss: 0.3830 - val_accuracy: 0.8850\n", "Epoch 7/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0600 - accuracy: 0.9764 - val_loss: 0.4238 - val_accuracy: 0.8848\n", "Epoch 8/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0557 - accuracy: 0.9786 - val_loss: 0.4402 - val_accuracy: 0.8814\n", "Epoch 9/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0472 - accuracy: 0.9808 - val_loss: 0.4760 - val_accuracy: 0.8854\n", "Epoch 10/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0446 - accuracy: 0.9818 - val_loss: 0.5129 - val_accuracy: 0.8838\n", "782/782 [==============================] - 11s 14ms/step - loss: 0.2819 - accuracy: 0.8843\n", "Test acc: 0.884\n" ] } ] }, { "cell_type": "markdown", "source": [ "This gets us to a test accuracy of 88.4%: not bad!" ], "metadata": { "id": "YbuIiJwYkGiR" } }, { "cell_type": "markdown", "source": [ "### Bigrams with binary encoding" ], "metadata": { "id": "2SoPSzSrkOXL" } }, { "cell_type": "markdown", "source": [ "The `TextVectorization` layer can be configured to return arbitrary N-grams: bigrams, trigrams, etc. Just pass an `ngrams=N` argument as in the following listing." 
], "metadata": { "id": "YOwpIaJjkeka" } }, { "cell_type": "code", "source": [ "text_vectorization = TextVectorization(\n", " ngrams=2,\n", " max_tokens=20000,\n", " output_mode=\"multi_hot\",\n", ")" ], "metadata": { "id": "GC3ZrIQmj3VB" }, "execution_count": 81, "outputs": [] }, { "cell_type": "code", "source": [ "text_vectorization.adapt(text_only_train_ds)\n", "binary_2gram_train_ds = train_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "binary_2gram_val_ds = val_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "binary_2gram_test_ds = test_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "\n", "model = get_model()\n", "model.summary()\n", "callbacks = [\n", " keras.callbacks.ModelCheckpoint(\"binary_2gram.keras\",\n", " save_best_only=True)\n", "]\n", "model.fit(binary_2gram_train_ds.cache(),\n", " validation_data=binary_2gram_val_ds.cache(),\n", " epochs=10,\n", " callbacks=callbacks)\n", "model = keras.models.load_model(\"binary_2gram.keras\")\n", "print(f\"Test acc: {model.evaluate(binary_2gram_test_ds)[1]:.3f}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "SdEQkw3mkhk9", "outputId": "5cfc6f2d-3521-43e2-8c05-d995b9824d7c" }, "execution_count": 82, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"model_1\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " input_2 (InputLayer) [(None, 20000)] 0 \n", " \n", " dense_10 (Dense) (None, 16) 320016 \n", " \n", " dropout_1 (Dropout) (None, 16) 0 \n", " \n", " dense_11 (Dense) (None, 1) 17 \n", " \n", "=================================================================\n", "Total params: 320,033\n", "Trainable params: 320,033\n", "Non-trainable params: 0\n", 
"_________________________________________________________________\n", "Epoch 1/10\n", "625/625 [==============================] - 24s 37ms/step - loss: 0.3597 - accuracy: 0.8476 - val_loss: 0.2467 - val_accuracy: 0.8980\n", "Epoch 2/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.1961 - accuracy: 0.9275 - val_loss: 0.2507 - val_accuracy: 0.8930\n", "Epoch 3/10\n", "625/625 [==============================] - 5s 8ms/step - loss: 0.1328 - accuracy: 0.9531 - val_loss: 0.2753 - val_accuracy: 0.8994\n", "Epoch 4/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0990 - accuracy: 0.9656 - val_loss: 0.3075 - val_accuracy: 0.8952\n", "Epoch 5/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0778 - accuracy: 0.9732 - val_loss: 0.3100 - val_accuracy: 0.8950\n", "Epoch 6/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0642 - accuracy: 0.9763 - val_loss: 0.3855 - val_accuracy: 0.8902\n", "Epoch 7/10\n", "625/625 [==============================] - 5s 7ms/step - loss: 0.0558 - accuracy: 0.9809 - val_loss: 0.3822 - val_accuracy: 0.8930\n", "Epoch 8/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0494 - accuracy: 0.9816 - val_loss: 0.4002 - val_accuracy: 0.8916\n", "Epoch 9/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0451 - accuracy: 0.9825 - val_loss: 0.4233 - val_accuracy: 0.8892\n", "Epoch 10/10\n", "625/625 [==============================] - 4s 6ms/step - loss: 0.0419 - accuracy: 0.9846 - val_loss: 0.4674 - val_accuracy: 0.8910\n", "782/782 [==============================] - 9s 11ms/step - loss: 0.2605 - accuracy: 0.8953\n", "Test acc: 0.895\n" ] } ] }, { "cell_type": "markdown", "source": [ "We’re now getting 89.5% test accuracy, a marked improvement! Turns out local order is pretty important." 
], "metadata": { "id": "0aGRj_DRkqtD" } }, { "cell_type": "markdown", "source": [ "### Bigrams with TF-IDF encoding" ], "metadata": { "id": "CB3pEfxck15o" } }, { "cell_type": "markdown", "source": [ "TF-IDF is so common that it’s built into the TextVectorization layer. All you need to do to start using it is to switch the output_mode argument to `tf_idf`." ], "metadata": { "id": "tPG88cwWk5zS" } }, { "cell_type": "code", "source": [ "from keras.layers import TextVectorization as TexVec" ], "metadata": { "id": "uRaK89lhnenM" }, "execution_count": 91, "outputs": [] }, { "cell_type": "code", "source": [ "text_vectorization = TexVec(\n", " ngrams=2,\n", " max_tokens=20000,\n", " output_mode=\"tf_idf\",\n", ")\n", "\n", "text_vectorization.adapt(text_only_train_ds)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 503 }, "id": "m2V7i76UklOO", "outputId": "fd33c64c-57cd-44ba-829c-99ddcb61c660" }, "execution_count": 92, "outputs": [ { "output_type": "error", "ename": "InvalidArgumentError", "evalue": "ignored", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mInvalidArgumentError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 5\u001b[0m )\n\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mtext_vectorization\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0madapt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtext_only_train_ds\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/keras/layers/preprocessing/text_vectorization.py\u001b[0m in \u001b[0;36madapt\u001b[0;34m(self, data, batch_size, steps)\u001b[0m\n\u001b[1;32m 426\u001b[0m \u001b[0margument\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0msupported\u001b[0m \u001b[0;32mwith\u001b[0m 
\u001b[0marray\u001b[0m \u001b[0minputs\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 427\u001b[0m \"\"\"\n\u001b[0;32m--> 428\u001b[0;31m \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0madapt\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch_size\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mbatch_size\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msteps\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0msteps\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 429\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 430\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mupdate_state\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/keras/engine/base_preprocessing_layer.py\u001b[0m in \u001b[0;36madapt\u001b[0;34m(self, data, batch_size, steps)\u001b[0m\n\u001b[1;32m 247\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mdata_handler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcatch_stop_iteration\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 248\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0m_\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mdata_handler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msteps\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 249\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_adapt_function\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0miterator\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 250\u001b[0m \u001b[0;32mif\u001b[0m 
\u001b[0mdata_handler\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshould_sync\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 251\u001b[0m \u001b[0mcontext\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0masync_wait\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py\u001b[0m in \u001b[0;36merror_handler\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 151\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 152\u001b[0m \u001b[0mfiltered_tb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_process_traceback_frames\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__traceback__\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 153\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwith_traceback\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfiltered_tb\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 154\u001b[0m \u001b[0;32mfinally\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 155\u001b[0m \u001b[0;32mdel\u001b[0m \u001b[0mfiltered_tb\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py\u001b[0m in \u001b[0;36mquick_execute\u001b[0;34m(op_name, num_outputs, inputs, attrs, ctx, name)\u001b[0m\n\u001b[1;32m 53\u001b[0m 
\u001b[0mctx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mensure_initialized\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 54\u001b[0m tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,\n\u001b[0;32m---> 55\u001b[0;31m inputs, attrs, num_outputs)\n\u001b[0m\u001b[1;32m 56\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mcore\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_NotOkStatusException\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 57\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mname\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mInvalidArgumentError\u001b[0m: Graph execution error:\n\n2 root error(s) found.\n (0) INVALID_ARGUMENT: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string\n\t [[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]\n\t [[add/_44]]\n (1) INVALID_ARGUMENT: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string\n\t [[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]\n0 successful operations.\n0 derived errors ignored. 
[Op:__inference_adapt_step_304531]" ] } ] }, { "cell_type": "code", "source": [ "# Workaround (assumption): the InvalidArgumentError above looks like a\n", "# GPU-placement problem with string tensors, so run adapt() on the CPU.\n", "with tf.device(\"/CPU:0\"):\n", "    text_vectorization.adapt(text_only_train_ds)\n", "\n", "tfidf_2gram_train_ds = train_ds.map(\n", "    lambda x, y: (text_vectorization(x), y),\n", "    num_parallel_calls=4)\n", "tfidf_2gram_val_ds = val_ds.map(\n", "    lambda x, y: (text_vectorization(x), y),\n", "    num_parallel_calls=4)\n", "tfidf_2gram_test_ds = test_ds.map(\n", "    lambda x, y: (text_vectorization(x), y),\n", "    num_parallel_calls=4)\n", "\n", "model = get_model()\n", "model.summary()\n", "callbacks = [\n", "    keras.callbacks.ModelCheckpoint(\"tfidf_2gram.keras\",\n", "                                    save_best_only=True)\n", "]\n", "model.fit(tfidf_2gram_train_ds.cache(),\n", "          validation_data=tfidf_2gram_val_ds.cache(),\n", "          epochs=10,\n", "          callbacks=callbacks)\n", "model = keras.models.load_model(\"tfidf_2gram.keras\")\n", "print(f\"Test acc: {model.evaluate(tfidf_2gram_test_ds)[1]:.3f}\")" ], "metadata": { "id": "MDp4V3VjlGVh" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "This gets us an 89.8% test accuracy on the IMDB classification task: TF-IDF weighting doesn’t seem to be particularly helpful in this case. However, for many text-classification datasets, it would be typical to see a one-percentage-point increase when using TF-IDF compared to plain binary encoding." ], "metadata": { "id": "qoz77TP4oewU" } }, { "cell_type": "markdown", "source": [ "## Processing words as a sequence: The sequence model approach" ], "metadata": { "id": "ZC7yKzENvV5N" } }, { "cell_type": "markdown", "source": [ "Let’s try out a first sequence model in practice. First, let’s prepare datasets that return integer sequences. In order to keep a manageable input size, **we’ll truncate the inputs after the first 600 words.**\n", "\n", "This is a reasonable choice, since the average review length is 233 words, and only 5% of reviews are longer than 600 words."
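, "\n",
"\n",
"As a rough sketch of what a fixed output length does to each encoded review (a hypothetical pure-Python helper named `pad_or_truncate`, not the Keras implementation), sequences are cut at the maximum length and shorter ones are filled with the padding index 0:\n",
"\n",
"```python\n",
"def pad_or_truncate(token_ids, max_length, pad_id=0):\n",
"    # Keep at most max_length tokens, then right-pad with pad_id.\n",
"    clipped = list(token_ids[:max_length])\n",
"    return clipped + [pad_id] * (max_length - len(clipped))\n",
"\n",
"short_review = [7, 3, 5, 9]\n",
"long_review = [7, 3, 5, 9, 1, 5, 10]\n",
"print(pad_or_truncate(short_review, max_length=6))\n",
"print(pad_or_truncate(long_review, max_length=6))\n",
"```\n",
"\n",
"With `max_length=6`, the short review gains two trailing zeros and the long one loses its last token, so both end up with the same length and can be stacked into one batch."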
], "metadata": { "id": "IC3yN0qevZM2" } }, { "cell_type": "code", "source": [ "max_length = 600\n", "max_tokens = 20000\n", "text_vectorization = layers.TextVectorization(\n", " max_tokens=max_tokens,\n", " output_mode=\"int\",\n", " output_sequence_length=max_length,\n", ")\n", "text_vectorization.adapt(text_only_train_ds)\n", "\n", "int_train_ds = train_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "int_val_ds = val_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)\n", "int_test_ds = test_ds.map(\n", " lambda x, y: (text_vectorization(x), y),\n", " num_parallel_calls=4)" ], "metadata": { "id": "8__1VvdClKay" }, "execution_count": 93, "outputs": [] }, { "cell_type": "markdown", "source": [ "Next, let’s make a model. The simplest way to convert our integer sequences to vector sequences is to one-hot encode the integers (each dimension would represent one possible term in the vocabulary). On top of these one-hot vectors, we’ll add a simple bidirectional LSTM." 
], "metadata": { "id": "ezd52NzTvmwl" } }, { "cell_type": "code", "source": [ "inputs = keras.Input(shape=(None,), dtype=\"int64\") # One input is a sequence of integers\n", "embedded = tf.one_hot(inputs, depth=max_tokens) # A 3D tensor of shape [batch size, time steps, embedding size]\n", "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", "x = layers.Dropout(0.5)(x)\n", "outputs = layers.Dense(1, activation=\"sigmoid\")(x) # Classification layer\n", "model = keras.Model(inputs, outputs)\n", "model.compile(optimizer=\"nadam\",\n", " loss=\"binary_crossentropy\",\n", " metrics=[\"accuracy\"])\n", "model.summary()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "jEGBs6wjvs5s", "outputId": "29b2e209-9bff-41ad-8f6d-cd443ffe2480" }, "execution_count": 99, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"model_5\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " input_8 (InputLayer) [(None, None)] 0 \n", " \n", " tf.one_hot_5 (TFOpLambda) (None, None, 20000) 0 \n", " \n", " bidirectional_3 (Bidirectio (None, 64) 5128448 \n", " nal) \n", " \n", " dropout_5 (Dropout) (None, 64) 0 \n", " \n", " dense_15 (Dense) (None, 1) 65 \n", " \n", "=================================================================\n", "Total params: 5,128,513\n", "Trainable params: 5,128,513\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ] }, { "cell_type": "markdown", "source": [ "A first observation: this model will train very slowly, especially compared to the lightweight model of the previous section. This is because our inputs are quite large: each input sample is encoded as a matrix of size `(600, 20000)` (600 words per sample, 20,000 possible words). That’s 12,000,000 floats for a single movie review. 
Our bidirectional LSTM has a lot of work to do.\n", "\n", "Let’s try word embeddings instead. What makes a good word-embedding space depends heavily on your task: the perfect word-embedding space for an English-language movie-review sentiment-analysis model may look different from the perfect embedding space for an English-language legal-document classification model, because the importance of certain semantic relationships varies from task to task. It’s thus reasonable to learn a new embedding space with every new task. Fortunately, backpropagation makes this easy, and Keras makes it even easier. It comes down to learning the weights of a layer: the `Embedding` layer." ], "metadata": { "id": "fUDH3SWRz9u3" } }, { "cell_type": "code", "source": [ "embedding_layer = layers.Embedding(input_dim=max_tokens, output_dim=256)" ], "metadata": { "id": "YYAaa3CNzbWY" }, "execution_count": 102, "outputs": [] }, { "cell_type": "markdown", "source": [ "The `Embedding` layer is best understood as a dictionary that maps integer indices (which stand for specific words) to dense vectors. It takes as input a rank-2 tensor of integers, of shape `(batch_size, sequence_length)`, where each entry is a sequence of integers. The layer then returns a 3D floating-point tensor of shape `(batch_size, sequence_length, embedding_dimensionality)`.\n", "\n", "When you instantiate an `Embedding` layer, its weights (its internal dictionary of token vectors) are initially random, just as with any other layer. During training, these word vectors are gradually adjusted via backpropagation, structuring the space into something the downstream model can exploit. Once fully trained, the embedding space will show a lot of structure, specialized for the specific problem for which you’re training your model. We will talk more about embeddings in Lecture 7."
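, "\n",
"\n",
"To make the dictionary analogy concrete, here is a minimal pure-Python sketch, assuming a tiny made-up table of 5 tokens with 3-dimensional vectors (the layer above stores a trainable table with `max_tokens` rows and 256 columns). The helper names are hypothetical. It shows that looking up row `i` gives the same vector as multiplying the one-hot encoding of `i` by the table, without ever building the one-hot vector:\n",
"\n",
"```python\n",
"import random\n",
"\n",
"random.seed(0)\n",
"vocab_size, embedding_dim = 5, 3  # toy sizes, for illustration only\n",
"# Randomly initialized table: one row (vector) per token index.\n",
"table = [[random.uniform(-1, 1) for _ in range(embedding_dim)]\n",
"         for _ in range(vocab_size)]\n",
"\n",
"def embed(token_ids):\n",
"    # Embedding lookup: pick one row per integer index.\n",
"    return [table[i] for i in token_ids]\n",
"\n",
"def one_hot(i, depth):\n",
"    return [1.0 if j == i else 0.0 for j in range(depth)]\n",
"\n",
"def one_hot_times_table(i):\n",
"    # Same result, computed the expensive way as a vector-matrix product.\n",
"    hot = one_hot(i, vocab_size)\n",
"    return [sum(hot[r] * table[r][c] for r in range(vocab_size))\n",
"            for c in range(embedding_dim)]\n",
"\n",
"sequence = [4, 1, 1, 0]\n",
"vectors = embed(sequence)\n",
"print(len(vectors), len(vectors[0]))\n",
"```\n",
"\n",
"At the sizes used in this notebook, a one-hot encoded review holds 600 × 20,000 = 12,000,000 values, while its embedded version holds 600 × 256 = 153,600."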
], "metadata": { "id": "aspfIuYU1b83" } }, { "cell_type": "code", "source": [ "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", "embedded = layers.Embedding(input_dim=max_tokens, output_dim=256)(inputs)\n", "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", "x = layers.Dropout(0.5)(x)\n", "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", "model = keras.Model(inputs, outputs)\n", "model.compile(optimizer=\"rmsprop\",\n", " loss=\"binary_crossentropy\",\n", " metrics=[\"accuracy\"])\n", "model.summary()\n", "\n", "callbacks = [\n", " keras.callbacks.ModelCheckpoint(\"embeddings_bidir_gru.keras\",\n", " save_best_only=True)\n", "]\n", "model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n", "model = keras.models.load_model(\"embeddings_bidir_gru.keras\")\n", "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "d1F2Z7fj1UJ-", "outputId": "0d76ca08-5e7c-4951-8d2f-508e0cc0f415" }, "execution_count": 103, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"model_6\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " input_9 (InputLayer) [(None, None)] 0 \n", " \n", " embedding_1 (Embedding) (None, None, 256) 5120000 \n", " \n", " bidirectional_4 (Bidirectio (None, 64) 73984 \n", " nal) \n", " \n", " dropout_6 (Dropout) (None, 64) 0 \n", " \n", " dense_16 (Dense) (None, 1) 65 \n", " \n", "=================================================================\n", "Total params: 5,194,049\n", "Trainable params: 5,194,049\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "Epoch 1/10\n", "625/625 [==============================] - 162s 250ms/step - loss: 0.4738 - accuracy: 0.7908 - val_loss: 0.3579 - val_accuracy: 0.8644\n", "Epoch 
2/10\n", "625/625 [==============================] - 163s 258ms/step - loss: 0.3071 - accuracy: 0.8917 - val_loss: 0.3044 - val_accuracy: 0.8774\n", "Epoch 3/10\n", "625/625 [==============================] - 242s 385ms/step - loss: 0.2422 - accuracy: 0.9157 - val_loss: 0.3072 - val_accuracy: 0.8870\n", "Epoch 4/10\n", "625/625 [==============================] - 227s 363ms/step - loss: 0.2074 - accuracy: 0.9307 - val_loss: 0.3473 - val_accuracy: 0.8794\n", "Epoch 5/10\n", "625/625 [==============================] - 144s 229ms/step - loss: 0.1751 - accuracy: 0.9432 - val_loss: 0.3301 - val_accuracy: 0.8856\n", "Epoch 6/10\n", "625/625 [==============================] - 199s 316ms/step - loss: 0.1413 - accuracy: 0.9546 - val_loss: 0.4382 - val_accuracy: 0.8706\n", "Epoch 7/10\n", "625/625 [==============================] - 248s 393ms/step - loss: 0.1254 - accuracy: 0.9586 - val_loss: 0.3836 - val_accuracy: 0.8816\n", "Epoch 8/10\n", "625/625 [==============================] - 188s 298ms/step - loss: 0.1065 - accuracy: 0.9660 - val_loss: 0.3809 - val_accuracy: 0.8860\n", "Epoch 9/10\n", "625/625 [==============================] - 198s 314ms/step - loss: 0.0907 - accuracy: 0.9717 - val_loss: 0.4376 - val_accuracy: 0.8440\n", "Epoch 10/10\n", "625/625 [==============================] - 286s 454ms/step - loss: 0.0789 - accuracy: 0.9750 - val_loss: 0.4318 - val_accuracy: 0.8844\n", "782/782 [==============================] - 119s 148ms/step - loss: 0.3496 - accuracy: 0.8573\n", "Test acc: 0.857\n" ] } ] }, { "cell_type": "markdown", "source": [ "We’re still some way off from the results of our basic bigram model. Part of the reason why is simply that the model is looking at slightly less data:\n", "the bigram model processed full reviews, while our sequence model truncates sequences after 600 words. One thing that’s slightly hurting model performance here is that our input sequences are full of zeros. 
This comes from our use of the `output_sequence_length=max_length` option in `TextVectorization` (with `max_length` equal to 600): **sentences longer than 600 tokens are truncated to a length of 600 tokens, and sentences shorter than 600 tokens are padded with zeros** at the end so that they can be concatenated together with other sequences to form contiguous batches.\n", "\n", "We’re using a bidirectional RNN: two RNN layers running in parallel, with one processing the tokens in their natural order, and the other processing the same\n", "tokens in reverse. **The RNN that looks at the tokens in their natural order will spend its last iterations seeing only vectors that encode padding, possibly for several hundred iterations if the original sentence was short**. The information stored in the internal state of the RNN will gradually fade out as it gets exposed to these meaningless inputs.\n", "\n", "We need some way to tell the RNN that it should skip these iterations. There’s an API for that: *masking*. The `Embedding` layer is capable of generating a “mask” that corresponds to its input data. This mask is a tensor of ones and zeros (or `True/False` booleans), of shape `(batch_size, sequence_length)`, where the entry `mask[i, t]` indicates whether timestep `t` of sample `i` should be skipped or not (the timestep will be skipped if `mask[i, t]` is 0 or False, and processed otherwise)."
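, "\n", "\n",
"To make the mask concrete, here is a minimal sketch under stated assumptions: TensorFlow 2.x is installed, and the names `toy_batch`, `toy_embedding` and `toy_mask` are hypothetical and used only for illustration. It asks a small `Embedding` layer with `mask_zero=True` for the mask it would compute on two zero-padded sequences:\n",
"\n",
"```python\n",
"import numpy as np\n",
"import tensorflow as tf\n",
"\n",
"# Hypothetical toy batch: two sequences padded with zeros to length 5\n",
"toy_batch = tf.constant([[5, 7, 2, 0, 0],\n",
"                         [3, 0, 0, 0, 0]])\n",
"\n",
"# mask_zero=True tells the layer to treat index 0 as padding\n",
"toy_embedding = tf.keras.layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)\n",
"\n",
"# compute_mask() returns the boolean mask the layer would pass downstream\n",
"toy_mask = np.asarray(toy_embedding.compute_mask(toy_batch))\n",
"print(toy_mask)  # True marks real tokens, False marks padding to skip\n",
"```\n",
"\n",
"The mask has shape `(batch_size, sequence_length)`, here `(2, 5)`, and is `False` exactly where the input index is 0."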
], "metadata": { "id": "_3dQg0bM2Ks3" } }, { "cell_type": "code", "source": [ "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", "embedded = layers.Embedding(\n", " input_dim=max_tokens, output_dim=256, mask_zero=True)(inputs) # You can turn it on by passing mask_zero=True\n", "x = layers.Bidirectional(layers.LSTM(32))(embedded)\n", "x = layers.Dropout(0.5)(x)\n", "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", "model = keras.Model(inputs, outputs)\n", "model.compile(optimizer=\"rmsprop\",\n", " loss=\"binary_crossentropy\",\n", " metrics=[\"accuracy\"])\n", "model.summary()\n", "\n", "callbacks = [\n", " keras.callbacks.ModelCheckpoint(\"embeddings_bidir_gru_with_masking.keras\",\n", " save_best_only=True)\n", "]\n", "model.fit(int_train_ds, validation_data=int_val_ds, epochs=10, callbacks=callbacks)\n", "model = keras.models.load_model(\"embeddings_bidir_gru_with_masking.keras\")\n", "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "h--MrGck12LD", "outputId": "48d231b8-a68d-4644-c2a3-d9add6b8cd68" }, "execution_count": 104, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"model_7\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " input_10 (InputLayer) [(None, None)] 0 \n", " \n", " embedding_2 (Embedding) (None, None, 256) 5120000 \n", " \n", " bidirectional_5 (Bidirectio (None, 64) 73984 \n", " nal) \n", " \n", " dropout_7 (Dropout) (None, 64) 0 \n", " \n", " dense_17 (Dense) (None, 1) 65 \n", " \n", "=================================================================\n", "Total params: 5,194,049\n", "Trainable params: 5,194,049\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "Epoch 1/10\n", "625/625 [==============================] - 187s 
280ms/step - loss: 0.4069 - accuracy: 0.8126 - val_loss: 0.2861 - val_accuracy: 0.8832\n", "Epoch 2/10\n", "625/625 [==============================] - 189s 299ms/step - loss: 0.2301 - accuracy: 0.9140 - val_loss: 0.2878 - val_accuracy: 0.8840\n", "Epoch 3/10\n", "625/625 [==============================] - 159s 252ms/step - loss: 0.1745 - accuracy: 0.9373 - val_loss: 0.2828 - val_accuracy: 0.8914\n", "Epoch 4/10\n", "625/625 [==============================] - 178s 282ms/step - loss: 0.1292 - accuracy: 0.9537 - val_loss: 0.3062 - val_accuracy: 0.8916\n", "Epoch 5/10\n", "625/625 [==============================] - 231s 366ms/step - loss: 0.0929 - accuracy: 0.9671 - val_loss: 0.3851 - val_accuracy: 0.8924\n", "Epoch 6/10\n", "625/625 [==============================] - 225s 357ms/step - loss: 0.0725 - accuracy: 0.9752 - val_loss: 0.3541 - val_accuracy: 0.8862\n", "Epoch 7/10\n", "625/625 [==============================] - 206s 326ms/step - loss: 0.0489 - accuracy: 0.9845 - val_loss: 0.6190 - val_accuracy: 0.8332\n", "Epoch 8/10\n", "625/625 [==============================] - 214s 339ms/step - loss: 0.0384 - accuracy: 0.9875 - val_loss: 0.4624 - val_accuracy: 0.8776\n", "Epoch 9/10\n", "625/625 [==============================] - 163s 257ms/step - loss: 0.0260 - accuracy: 0.9912 - val_loss: 0.5290 - val_accuracy: 0.8764\n", "Epoch 10/10\n", "625/625 [==============================] - 184s 291ms/step - loss: 0.0191 - accuracy: 0.9939 - val_loss: 0.7019 - val_accuracy: 0.8642\n", "782/782 [==============================] - 142s 176ms/step - loss: 0.3108 - accuracy: 0.8718\n", "Test acc: 0.872\n" ] } ] }, { "cell_type": "markdown", "source": [ "## The Transformer encoder" ], "metadata": { "id": "hycMo2qy3uO_" } }, { "cell_type": "markdown", "source": [ "The encoder part of the Transformer can be used for text classification: it’s a very generic module that ingests a sequence and learns to turn it into a more useful representation. 
Let’s implement a Transformer encoder using the Keras subclassing API." ], "metadata": { "id": "ozI7Y4wx6ScJ" } }, { "cell_type": "code", "source": [
"class TransformerEncoder(layers.Layer):\n",
"    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):\n",
"        super().__init__(**kwargs)\n",
"        self.embed_dim = embed_dim  # Size of the input token vectors\n",
"        self.dense_dim = dense_dim  # Size of the inner dense layer\n",
"        self.num_heads = num_heads  # Number of attention heads\n",
"        self.attention = layers.MultiHeadAttention(\n",
"            num_heads=num_heads, key_dim=embed_dim)\n",
"        self.dense_proj = keras.Sequential(\n",
"            [layers.Dense(dense_dim, activation=\"relu\"),\n",
"             layers.Dense(embed_dim),]\n",
"        )\n",
"        self.layernorm_1 = layers.LayerNormalization()\n",
"        self.layernorm_2 = layers.LayerNormalization()\n",
"    # Computation goes in call().\n",
"    def call(self, inputs, mask=None):\n",
"        # The mask that will be generated by the Embedding layer will be 2D, but\n",
"        # the attention layer expects it to be 3D or 4D, so we expand its rank.\n",
"        if mask is not None:\n",
"            mask = mask[:, tf.newaxis, :]\n",
"        attention_output = self.attention(\n",
"            inputs, inputs, attention_mask=mask)\n",
"        proj_input = self.layernorm_1(inputs + attention_output)\n",
"        proj_output = self.dense_proj(proj_input)\n",
"        return self.layernorm_2(proj_input + proj_output)\n",
"    # Implement serialization so we can save the model.\n",
"    def get_config(self):\n",
"        config = super().get_config()\n",
"        config.update({\n",
"            \"embed_dim\": self.embed_dim,\n",
"            \"num_heads\": self.num_heads,\n",
"            \"dense_dim\": self.dense_dim,\n",
"        })\n",
"        return config"
], "metadata": { "id": "3mjNWl3L5SFK" }, "execution_count": 105, "outputs": [] }, { "cell_type": "markdown", "source": [
"When you write custom layers, make sure to implement the `get_config` method: this enables the layer to be reinstantiated from its config dict, which is useful during model saving and loading.\n",
"\n",
"To add positional encoding, we’ll do something simpler and more effective: we’ll learn position-embedding vectors the same way we learn to embed word indices. We’ll then proceed to add our position embeddings to the corresponding word embeddings, to obtain a position-aware word embedding. This technique is called **“positional embedding.”** Let’s implement it. **Note that neural networks don’t like very large input values or discrete input distributions**, so simply adding the position information as a raw integer is not a good idea."
], "metadata": { "id": "-NEaOFJs7aTW" } }, { "cell_type": "code", "source": [
"class PositionalEmbedding(layers.Layer):\n",
"    # A downside of position embeddings is that the sequence length needs to be known in advance.\n",
"    def __init__(self, sequence_length, input_dim, output_dim, **kwargs):\n",
"        super().__init__(**kwargs)\n",
"        # Prepare an Embedding layer for the token indices.\n",
"        self.token_embeddings = layers.Embedding(\n",
"            input_dim=input_dim, output_dim=output_dim)\n",
"        # And another one for the token positions\n",
"        self.position_embeddings = layers.Embedding(\n",
"            input_dim=sequence_length, output_dim=output_dim)\n",
"        self.sequence_length = sequence_length\n",
"        self.input_dim = input_dim\n",
"        self.output_dim = output_dim\n",
"\n",
"    def call(self, inputs):\n",
"        length = tf.shape(inputs)[-1]\n",
"        positions = tf.range(start=0, limit=length, delta=1)\n",
"        embedded_tokens = self.token_embeddings(inputs)\n",
"        embedded_positions = self.position_embeddings(positions)\n",
"        # Add both embedding vectors together\n",
"        return embedded_tokens + embedded_positions\n",
"\n",
"    # Like the Embedding layer, this layer should be able to generate a\n",
"    # mask so we can ignore padding 0s in the inputs. The compute_mask\n",
"    # method will be called automatically by the framework, and the\n",
"    # mask will get propagated to the next layer.\n",
"    def compute_mask(self, inputs, mask=None):\n",
"        return tf.math.not_equal(inputs, 0)\n",
"\n",
"    def get_config(self):\n",
"        config = super().get_config()\n",
"        config.update({\n",
"            \"output_dim\": self.output_dim,\n",
"            \"sequence_length\": self.sequence_length,\n",
"            \"input_dim\": self.input_dim,\n",
"        })\n",
"        return config"
], "metadata": { "id": "kaYlKHpS7hgI" }, "execution_count": 106, "outputs": [] }, { "cell_type": "markdown", "source": [
"All you have to do to start taking word order into account is swap the old `Embedding` layer with our position-aware version."
], "metadata": { "id": "ac1jhOCK-W0q" } }, { "cell_type": "code", "source": [ "vocab_size = 20000\n", "sequence_length = 600\n", "embed_dim = 256\n", "num_heads = 2\n", "dense_dim = 32\n", "\n", "inputs = keras.Input(shape=(None,), dtype=\"int64\")\n", "x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(inputs)\n", "x = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)\n", "x = layers.GlobalMaxPooling1D()(x)\n", "x = layers.Dropout(0.5)(x)\n", "outputs = layers.Dense(1, activation=\"sigmoid\")(x)\n", "model = keras.Model(inputs, outputs)\n", "model.compile(optimizer=\"rmsprop\",\n", " loss=\"binary_crossentropy\",\n", " metrics=[\"accuracy\"])\n", "model.summary()\n", "\n", "callbacks = [\n", " keras.callbacks.ModelCheckpoint(\"full_transformer_encoder.keras\",\n", " save_best_only=True)\n", "]\n", "model.fit(int_train_ds, validation_data=int_val_ds, epochs=20, callbacks=callbacks)\n", "model = keras.models.load_model(\n", " \"full_transformer_encoder.keras\",\n", " custom_objects={\"TransformerEncoder\": TransformerEncoder,\n", " \"PositionalEmbedding\": PositionalEmbedding})\n", "print(f\"Test acc: {model.evaluate(int_test_ds)[1]:.3f}\")" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "hA5Bdn6B-d-i", "outputId": "3c3ad580-9b4b-49cc-acb3-acee89a813dd" }, "execution_count": 107, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Model: \"model_8\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " input_11 (InputLayer) [(None, None)] 0 \n", " \n", " positional_embedding (Posit (None, None, 256) 5273600 \n", " ionalEmbedding) \n", " \n", " transformer_encoder (Transf (None, None, 256) 543776 \n", " ormerEncoder) \n", " \n", " global_max_pooling1d (Globa (None, 256) 0 \n", " lMaxPooling1D) \n", " \n", " dropout_8 (Dropout) (None, 256) 0 \n", " \n", " dense_20 (Dense) 
(None, 1) 257 \n", " \n", "=================================================================\n", "Total params: 5,817,633\n", "Trainable params: 5,817,633\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "Epoch 1/20\n", "625/625 [==============================] - 112s 174ms/step - loss: 0.4764 - accuracy: 0.7829 - val_loss: 0.2859 - val_accuracy: 0.8824\n", "Epoch 2/20\n", "625/625 [==============================] - 108s 172ms/step - loss: 0.2302 - accuracy: 0.9129 - val_loss: 0.2564 - val_accuracy: 0.8934\n", "Epoch 3/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.1735 - accuracy: 0.9365 - val_loss: 0.2931 - val_accuracy: 0.8946\n", "Epoch 4/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.1459 - accuracy: 0.9466 - val_loss: 0.3388 - val_accuracy: 0.8904\n", "Epoch 5/20\n", "625/625 [==============================] - 109s 174ms/step - loss: 0.1228 - accuracy: 0.9556 - val_loss: 0.4234 - val_accuracy: 0.8728\n", "Epoch 6/20\n", "625/625 [==============================] - 108s 172ms/step - loss: 0.1067 - accuracy: 0.9618 - val_loss: 0.2989 - val_accuracy: 0.8886\n", "Epoch 7/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.0963 - accuracy: 0.9661 - val_loss: 0.3570 - val_accuracy: 0.8796\n", "Epoch 8/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.0869 - accuracy: 0.9699 - val_loss: 0.4418 - val_accuracy: 0.8804\n", "Epoch 9/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.0785 - accuracy: 0.9734 - val_loss: 0.3722 - val_accuracy: 0.8790\n", "Epoch 10/20\n", "625/625 [==============================] - 107s 172ms/step - loss: 0.0722 - accuracy: 0.9750 - val_loss: 0.3609 - val_accuracy: 0.8838\n", "Epoch 11/20\n", "625/625 [==============================] - 108s 172ms/step - loss: 0.0688 - accuracy: 0.9776 - val_loss: 0.4718 - val_accuracy: 0.8834\n", "Epoch 12/20\n", 
"625/625 [==============================] - 108s 172ms/step - loss: 0.0612 - accuracy: 0.9803 - val_loss: 0.4938 - val_accuracy: 0.8778\n", "Epoch 13/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.0582 - accuracy: 0.9808 - val_loss: 0.4725 - val_accuracy: 0.8712\n", "Epoch 14/20\n", "625/625 [==============================] - 109s 173ms/step - loss: 0.0553 - accuracy: 0.9826 - val_loss: 0.4691 - val_accuracy: 0.8744\n", "Epoch 15/20\n", "625/625 [==============================] - 109s 173ms/step - loss: 0.0480 - accuracy: 0.9844 - val_loss: 0.5810 - val_accuracy: 0.8750\n", "Epoch 16/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.0441 - accuracy: 0.9862 - val_loss: 0.5527 - val_accuracy: 0.8716\n", "Epoch 17/20\n", "625/625 [==============================] - 108s 173ms/step - loss: 0.0410 - accuracy: 0.9869 - val_loss: 0.5501 - val_accuracy: 0.8516\n", "Epoch 18/20\n", "625/625 [==============================] - 108s 172ms/step - loss: 0.0374 - accuracy: 0.9882 - val_loss: 0.5607 - val_accuracy: 0.8714\n", "Epoch 19/20\n", "625/625 [==============================] - 108s 172ms/step - loss: 0.0353 - accuracy: 0.9892 - val_loss: 0.6018 - val_accuracy: 0.8660\n", "Epoch 20/20\n", "625/625 [==============================] - 107s 172ms/step - loss: 0.0282 - accuracy: 0.9916 - val_loss: 0.7235 - val_accuracy: 0.8712\n", "782/782 [==============================] - 49s 62ms/step - loss: 0.2853 - accuracy: 0.8802\n", "Test acc: 0.880\n" ] } ] }, { "cell_type": "code", "source": [ "" ], "metadata": { "id": "9cSp01jOCD_R" }, "execution_count": null, "outputs": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "nav_menu": {}, "toc": 
{ "navigate_menu": true, "number_sections": true, "sideBar": true, "threshold": 6, "toc_cell": false, "toc_section_display": "block", "toc_window_display": false }, "colab": { "name": "04_Recurrent Neural Networks.ipynb", "provenance": [], "toc_visible": true, "collapsed_sections": [] }, "accelerator": "GPU" }, "nbformat": 4, "nbformat_minor": 0 }